Abstract

In the distributed coding of correlated sources, the problem of characterizing the joint 
probability distribution of a pair of random variables satisfying an n-letter Markov chain 
arises. The exact solution of this problem is intractable. In this paper, we seek a single- 
letter necessary condition for this n-letter Markov chain. To this end, we propose a new 
data processing inequality on a new measure of correlation by means of spectrum analysis. 
Based on this new data processing inequality, we provide a single-letter necessary condition 
for the required joint probability distribution. We apply our results to two specific examples 
involving the distributed coding of correlated sources: multi-terminal rate-distortion region 
and multiple access channel with correlated sources, and propose new necessary conditions 
for these two problems. 
1 Problem Formulation



In this paper, we consider a pair of correlated discrete source sequences with length n,
(U^n, V^n) = {(U_1, V_1), ..., (U_n, V_n)}, which are independent and identically distributed (i.i.d.)
in time, i.e.,

p(u^n, v^n) = \prod_{i=1}^{n} p(u_i, v_i) (1)

and

p(u_i, v_i) = p(u, v), i = 1, ..., n (2)

where the single-letter joint distribution p(u, v) is defined on the alphabet \mathcal{U} \times \mathcal{V}.
Let (X_1, X_2) be two random variables such that (X_1, X_2, U^n, V^n) satisfies

p(x_1, x_2, u^n, v^n) = p(u^n, v^n) p(x_1 | u^n) p(x_2 | v^n) (3)

or equivalently^1,

X_1 \to U^n \to V^n \to X_2

This Markov chain appears in some problems involving the distributed coding of correlated
sources. For example, in the distributed rate-distortion problem [4-6], (X_1, X_2) is used to
reconstruct (\hat{U}^n, \hat{V}^n), an estimate of the sources (U^n, V^n), and in the problem of the
multiple access channel with correlated sources [7,8], (X_1, X_2) is sent through a multiple access
channel in one channel use. Although these specific problems have been studied separately in their
own contexts, the common nature of these problems, the distributed coding of correlated sources,
enables us to conduct a general study, which will be applicable to these specific problems.

The study of the converse proofs of (or the necessary conditions for) the above specific
problems raises the following questions. We know that the correlation between (X_1, X_2) is
limited, if a single-letter Markov chain X_1 \to U \to V \to X_2 is to be satisfied. With the
help of more letters of the sources, i.e., X_1 \to U^n \to V^n \to X_2 with n larger than 1, the
correlation between (X_1, X_2) may increase. The question here is how correlated (X_1, X_2)
can be when n goes to infinity. More specifically, can they be arbitrarily correlated? If not,
then how much extra correlation can (X_1, X_2) gain when n goes from 1 to \infty? To answer
these questions, we need to determine the set of all "valid" joint probability distributions
p(x_1, x_2), if X_1 \to U^n \to V^n \to X_2 is to be satisfied with n going to infinity^2, i.e.,

\mathcal{S}_{X_1X_2} \triangleq {p(x_1, x_2) : X_1 \to U^n \to V^n \to X_2, n \to \infty} (4)

^1 X_1 = f_1(U^n) and X_2 = f_2(V^n) is a degenerate case.

^2 We are also interested in determining the set of all "valid" probability distributions p(x_1, x_2, u_1, v_1), or
the set of all "valid" probability distributions p(x_1, x_2, u_1, u_2, v_1, v_2), etc., if this Markov chain constraint is
to be satisfied.
We note that it is practically impossible to exhaust the elements in the set \mathcal{S}_{X_1X_2} by
searching over all conditional distribution pairs (p(x_1 | u^n), p(x_2 | v^n)) when n \to \infty. In other
words, determining the set of all possible probability distributions p(x_1, x_2) satisfying the
n-letter Markov chain, i.e., the set \mathcal{S}_{X_1X_2}, seems computationally intractable. To avoid this
problem, we seek a single-letter necessary condition for the above n-letter Markov chain. The
resulting set, characterized by computable single-letter constraints, will contain the target
set \mathcal{S}_{X_1X_2}.

The most intuitive necessary condition for a Markov chain is the data processing inequality
[9, p. 32], i.e., if X_1 \to U^n \to V^n \to X_2, then

I(X_1; X_2) \le I(U^n; V^n) = n I(U; V) (5)

Since I(U^n; V^n) increases linearly with n, the constraint in (5) will be loose when n is
sufficiently large. Although the data processing inequality in its usual form does not prove 
useful in this problem, we will still use the basic methodology of employing a data processing 
inequality to find a necessary condition for the n-letter Markov chain under consideration. 
For this, we will introduce a new measure of correlation, and develop a new data processing 
inequality based on this new measure of correlation. 
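
To see how quickly the bound in (5) becomes vacuous, the following minimal Python sketch evaluates n I(U; V) for a small example. The source distribution (a doubly symmetric binary pair with Pr(U != V) = 0.1), the binary alphabets for X_1 and X_2, and the function name are our own illustrative assumptions and are not taken from the paper.

```python
import math

def mutual_information(joint):
    """Return I(U;V) in bits for a joint pmf given as a list of rows."""
    pu = [sum(row) for row in joint]
    pv = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (pu[i] * pv[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

# Assumed example source: doubly symmetric binary pair with Pr(U != V) = 0.1.
joint_uv = [[0.45, 0.05], [0.05, 0.45]]
single_letter = mutual_information(joint_uv)

# For binary X_1 and X_2, I(X_1; X_2) can never exceed log2(2) = 1 bit.
trivial_cap = 1.0

for n in (1, 2, 4, 8):
    bound = n * single_letter  # right-hand side of (5)
    print(n, round(bound, 4), "informative" if bound < trivial_cap else "vacuous")
```

Under these assumptions the right-hand side of (5) already exceeds the trivial one-bit cap at n = 2, which is the sense in which the usual data processing inequality is loose for large n.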

Spectrum analysis has been instrumental in the study of some properties of pairs of 
correlated random variables, especially, those of i.i.d. sequences of pairs of correlated random 
variables, e.g., common information in [10] and isomorphism in [11]. In this paper, we use 
spectrum analysis to introduce a new data processing inequality, which provides a single- 
letter necessary condition for the joint distributions satisfying the n-letter Markov chain. 

2 Main Results 
2.1 Some Preliminaries 

In this section, we provide some basic results which will be used in our later development. 
The concepts used here were originally introduced by Witsenhausen in [10] in the context of
operator theory. Here, we focus on the finite alphabet case, and derive our results by means 
of matrix theory. 

We first introduce our matrix notation for probability distributions. For a pair of discrete
random variables X and Y, which take values in \mathcal{X} and \mathcal{Y}, respectively, the
|\mathcal{X}| \times |\mathcal{Y}| joint probability distribution matrix P_{XY} is defined as

P_{XY}(i, j) = Pr(X = x_i, Y = y_j) (6)

where P_{XY}(i, j) denotes the (i, j)-th element of the matrix P_{XY}. The marginal distribution
matrix of a random variable X, P_X, is defined as a diagonal matrix with

P_X(i, i) = Pr(X = x_i) (7)

and the vector-form marginal distribution, p_X, is defined as^3

p_X(i) = Pr(X = x_i) (8)

or equivalently p_X = P_X e, where e is the vector of all ones. p_X can also be defined as
p_X = P_{XY} for some degenerate random variable Y whose alphabet size |\mathcal{Y}| is equal to one.
For convenience, we define

p_X^{1/2} \triangleq P_X^{1/2} e (9)



For conditional distributions, we define the matrix P_{XY|z} as

P_{XY|z}(i, j) = Pr(X = x_i, Y = y_j | Z = z) (10)

The vector-form conditional distribution p_{X|z} is defined as

p_{X|z}(i) = Pr(X = x_i | Z = z) (11)

or equivalently, p_{X|z} = P_{XY|z} for some degenerate random variable Y whose alphabet size
|\mathcal{Y}| is equal to one.

We define a new matrix, \tilde{P}_{XY}, which will play an important role in the rest of the paper,
as

\tilde{P}_{XY} = P_X^{-1/2} P_{XY} P_Y^{-1/2} (12)

Since p_X = P_{XY} for some degenerate random variable Y whose alphabet size |\mathcal{Y}| is equal to
one, we define

\tilde{p}_X = P_X^{-1/2} P_{XY} P_Y^{-1/2} = P_X^{-1/2} p_X = p_X^{1/2} (13)

The counterparts for conditional distributions, \tilde{P}_{XY|z} and \tilde{p}_{X|z}, can be defined similarly.

A valid joint distribution matrix, P_{XY}, is a matrix whose entries are non-negative and
sum to 1. Due to this constraint, not every matrix will qualify as a \tilde{P}_{XY} corresponding to a
joint distribution matrix as defined in (12). A necessary and sufficient condition for \tilde{P}_{XY} to
correspond to a joint distribution matrix is given in Theorem 1 below, which identifies the
spectral properties of \tilde{P}_{XY}. Before stating the theorem, we provide a lemma and a definition
regarding stochastic matrices, which will be used in the proof of the theorem.

^3 In this paper, we only consider the case where p_X is a positive vector.
Definition 1 [12, p. 48] A square matrix T of order n is called (row) stochastic if

T(i, j) \ge 0, i, j = 1, ..., n, \sum_{j=1}^{n} T(i, j) = 1, i = 1, ..., n (14)

Lemma 1 [12, p. 49] The spectral radius of a stochastic matrix is 1. A non-negative matrix
T is stochastic if and only if e is an eigenvector of T corresponding to the eigenvalue 1.

Theorem 1 A non-negative matrix P is a joint distribution matrix with marginal distributions
P_X and P_Y, i.e., P e = p_X = P_X e and P^T e = p_Y = P_Y e, if and only if the singular
value decomposition (SVD) of the non-negative matrix \tilde{P} = P_X^{-1/2} P P_Y^{-1/2} satisfies

\tilde{P} = M \Lambda N^T = p_X^{1/2} (p_Y^{1/2})^T + \sum_{i=2}^{l} \lambda_i \mu_i \nu_i^T (15)

where M = [\mu_1, ..., \mu_l] and N = [\nu_1, ..., \nu_l] are two unitary matrices,
\Lambda = diag[\lambda_1, ..., \lambda_l] and l = min(|\mathcal{X}|, |\mathcal{Y}|); \mu_1 = p_X^{1/2},
\nu_1 = p_Y^{1/2}, and \lambda_1 = 1 \ge \lambda_2 \ge ... \ge \lambda_l \ge 0. That is, all of
the singular values of \tilde{P} are between 0 and 1, the largest singular value of \tilde{P} is 1, and the
corresponding left and right singular vectors are p_X^{1/2} and p_Y^{1/2}.



Proof: Let \tilde{P} satisfy (15), then

P_X^{1/2} \tilde{P} P_Y^{1/2} e = P_X^{1/2} ( p_X^{1/2} (p_Y^{1/2})^T + \sum_{i=2}^{l} \lambda_i \mu_i \nu_i^T ) p_Y^{1/2}
= P_X^{1/2} p_X^{1/2}
= p_X (16)

Similarly, e^T P_X^{1/2} \tilde{P} P_Y^{1/2} = p_Y^T. Thus, the non-negative matrix P_X^{1/2} \tilde{P} P_Y^{1/2} is a joint
distribution matrix with marginal distributions p_X and p_Y.

Conversely, we consider a joint distribution P with marginal distributions p_X and p_Y.
We need to show that the singular values of \tilde{P} lie in [0, 1], the largest singular value is equal
to 1, and p_X^{1/2} and p_Y^{1/2}, respectively, are the left and right singular vectors corresponding to
the singular value 1. To this end, we first construct a Markov chain X \to Y \to Z with
P_{XY} = P_{ZY} = P (this construction comes from [10]). Note that this also implies P_X = P_Z,
\tilde{P}_{XY} = \tilde{P}_{ZY} = \tilde{P}, and P_{X|Y} = P_{Z|Y}. The special structure of the constructed Markov chain
provides the following:

P_{X|Z} = P_{X|Y} P_{Y|Z}
= P_{X|Y} P_{Y|X}
= P P_Y^{-1} P^T P_X^{-1}
= P_X^{1/2} (P_X^{-1/2} P P_Y^{-1/2}) (P_Y^{-1/2} P^T P_X^{-1/2}) P_X^{-1/2}
= P_X^{1/2} \tilde{P} \tilde{P}^T P_X^{-1/2} (17)

which implies that the matrix P_{X|Z} is similar to the matrix \tilde{P} \tilde{P}^T [13, p. 44]. Therefore, all
the eigenvalues of P_{X|Z} are the eigenvalues of \tilde{P} \tilde{P}^T as well, and if \nu is a left eigenvector of
P_{X|Z} corresponding to an eigenvalue \lambda, then P_X^{1/2} \nu is a left eigenvector of \tilde{P} \tilde{P}^T corresponding
to the same eigenvalue.

We note that P_{X|Z} is a stochastic matrix; therefore, from Lemma 1, e is a left eigenvector
of P_{X|Z} corresponding to the eigenvalue 1, which is equal to the spectral radius of P_{X|Z}. Since
P_{X|Z} is similar to \tilde{P} \tilde{P}^T, we have that p_X^{1/2} is a left eigenvector of \tilde{P} \tilde{P}^T with eigenvalue 1,
and all the eigenvalues of \tilde{P} \tilde{P}^T lie in [-1, 1]. In addition, \tilde{P} \tilde{P}^T is a symmetric positive
semi-definite matrix, which implies that the eigenvalues of \tilde{P} \tilde{P}^T are real and non-negative.
Since the eigenvalues of \tilde{P} \tilde{P}^T are non-negative, and the largest eigenvalue is equal to 1, we
conclude that all of the eigenvalues of \tilde{P} \tilde{P}^T lie in the interval [0, 1].

The singular values of \tilde{P} are the square roots of the eigenvalues of \tilde{P} \tilde{P}^T, and the left
singular vectors of \tilde{P} are the eigenvectors of \tilde{P} \tilde{P}^T. Thus, the singular values of \tilde{P} lie in [0, 1],
the largest singular value is equal to 1, and p_X^{1/2} is a left singular vector corresponding to the
singular value 1. The corresponding right singular vector is

\nu_1^T = \mu_1^T \tilde{P} = (p_X^{1/2})^T P_X^{-1/2} P P_Y^{-1/2} = e^T P P_Y^{-1/2} = p_Y^T P_Y^{-1/2} = (p_Y^{1/2})^T (18)

which concludes the proof. ■

This theorem implies that there is a one-to-one relationship between P and \tilde{P}. It is easy
to see from (12) that there is a unique \tilde{P} for every P. Conversely, any given \tilde{P} satisfying
(15) gives a unique pair of marginal distributions (P_X, P_Y), which is specified by the left and
right positive singular vectors corresponding to its largest singular value^4. Then, from (12),
using \tilde{P} and (P_X, P_Y) given by its singular vectors, we obtain a unique P as

P = P_X^{1/2} \tilde{P} P_Y^{1/2} (19)

Because of this one-to-one relationship, exploring all possible joint distribution matrices P
is equivalent to exploring all possible non-negative matrices \tilde{P} satisfying (15).

^4 We observe that there may exist multiple singular values equal to 1, but \mu_1 and \nu_1 are the only positive
singular vectors.

Here, \lambda_2, ..., \lambda_l can be viewed as a group of quantities which measure the correlation
between random variables X and Y. We note that when \lambda_2 = ... = \lambda_l = 1, X and Y are
fully correlated, and when \lambda_2 = ... = \lambda_l = 0, X and Y are independent. In all the cases
between these two extremes, X and Y are arbitrarily correlated. Moreover, Witsenhausen
showed that X and Y have common data if and only if \lambda_2 = 1 [10]. In the next section,
we will propose a new data processing inequality with respect to these new measures of
correlation, \lambda_2, ..., \lambda_l. By utilizing this new data processing inequality, we will provide a
single-letter necessary condition for the n-letter Markov chain X_1 \to U^n \to V^n \to X_2.
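
To make the construction in (12) and the spectral statement of Theorem 1 concrete, the following minimal Python sketch computes \tilde{P} for a few 2 x 2 joint distributions and reads off its two singular values. The example matrices and all function names are our own illustrative assumptions; the closed-form singular values (obtained from the Frobenius norm and the determinant) are specific to the 2 x 2 case and are not the method used in the paper.

```python
import math

def tilde(joint):
    """Return P~ = P_X^{-1/2} P P_Y^{-1/2} for a joint pmf given as a list of rows."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return [[p / math.sqrt(px[i] * py[j]) for j, p in enumerate(row)]
            for i, row in enumerate(joint)]

def singular_values_2x2(a):
    """Singular values (largest first) of a 2x2 matrix.

    Uses s1^2 + s2^2 = squared Frobenius norm and s1 * s2 = |det|.
    """
    frob = sum(x * x for row in a for x in row)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(max(frob * frob - 4.0 * det * det, 0.0))
    return (math.sqrt((frob + disc) / 2.0),
            math.sqrt(max((frob - disc) / 2.0, 0.0)))

# Assumed example joint distributions of (X, Y) with positive marginals.
examples = {
    "correlated": [[0.4, 0.1], [0.2, 0.3]],
    "independent": [[0.3, 0.2], [0.3, 0.2]],
    "identical": [[0.5, 0.0], [0.0, 0.5]],
}
spectra = {name: singular_values_2x2(tilde(joint))
           for name, joint in examples.items()}
for name, (lam1, lam2) in spectra.items():
    print(name, round(lam1, 6), round(lam2, 6))
```

In each case the largest singular value is 1, as Theorem 1 requires, while the second one moves from 0 (independent) to 1 (identical) as the pair becomes more correlated.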

2.2 A New Data Processing Inequality 

In this section, first, we introduce a new data processing inequality in the following theorem. 
Here, we provide a lemma that will be used in the proof of the theorem. 

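Lemma 2 [p. 178] For matrices A and B

\lambda_i(AB) \le \lambda_i(A) \lambda_1(B) (20)

where \lambda_i(\cdot) denotes the i-th largest singular value of a matrix.

Theorem 2 If X \to Y \to Z, then

\lambda_i(\tilde{P}_{XZ}) \le \lambda_i(\tilde{P}_{XY}) \lambda_2(\tilde{P}_{YZ}) \le \lambda_i(\tilde{P}_{XY}) (21)

where i = 2, ..., rank(\tilde{P}_{XZ}).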

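Proof: From the structure of the Markov chain, and from the definition of \tilde{P}_{XY} in (12), we
have

\tilde{P}_{XZ} = P_X^{-1/2} P_{XZ} P_Z^{-1/2}
= P_X^{-1/2} P_{XY} P_Y^{-1/2} P_Y^{-1/2} P_{YZ} P_Z^{-1/2}
= \tilde{P}_{XY} \tilde{P}_{YZ} (22)

Using (15) for \tilde{P}_{XZ}, we obtain

\tilde{P}_{XZ} = p_X^{1/2} (p_Z^{1/2})^T + \sum_{i=2}^{l} \lambda_i(\tilde{P}_{XZ}) \mu_i(\tilde{P}_{XZ}) \nu_i(\tilde{P}_{XZ})^T (23)

and applying (15) to \tilde{P}_{XY} and \tilde{P}_{YZ} yields

\tilde{P}_{XY} \tilde{P}_{YZ}
= ( p_X^{1/2} (p_Y^{1/2})^T + \sum_{i=2}^{l} \lambda_i(\tilde{P}_{XY}) \mu_i(\tilde{P}_{XY}) \nu_i(\tilde{P}_{XY})^T ) ( p_Y^{1/2} (p_Z^{1/2})^T + \sum_{i=2}^{l} \lambda_i(\tilde{P}_{YZ}) \mu_i(\tilde{P}_{YZ}) \nu_i(\tilde{P}_{YZ})^T )
= p_X^{1/2} (p_Z^{1/2})^T + ( \sum_{i=2}^{l} \lambda_i(\tilde{P}_{XY}) \mu_i(\tilde{P}_{XY}) \nu_i(\tilde{P}_{XY})^T ) ( \sum_{i=2}^{l} \lambda_i(\tilde{P}_{YZ}) \mu_i(\tilde{P}_{YZ}) \nu_i(\tilde{P}_{YZ})^T ) (24)

where the two cross-terms vanish because p_Y^{1/2} plays the roles of both \nu_1(\tilde{P}_{XY}) and \mu_1(\tilde{P}_{YZ}),
and therefore, p_Y^{1/2} is orthogonal to both \nu_i(\tilde{P}_{XY}) and \mu_j(\tilde{P}_{YZ}), for all i, j \ne 1. Using (22)
and equating (23) and (24), we obtain

\sum_{i=2}^{l} \lambda_i(\tilde{P}_{XZ}) \mu_i(\tilde{P}_{XZ}) \nu_i(\tilde{P}_{XZ})^T
= ( \sum_{i=2}^{l} \lambda_i(\tilde{P}_{XY}) \mu_i(\tilde{P}_{XY}) \nu_i(\tilde{P}_{XY})^T ) ( \sum_{i=2}^{l} \lambda_i(\tilde{P}_{YZ}) \mu_i(\tilde{P}_{YZ}) \nu_i(\tilde{P}_{YZ})^T ) (25)

The proof is completed by applying Lemma 2 to (25) and also by noting that \lambda_2(\tilde{P}_{YZ}) \le 1
from Theorem 1. ■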

Theorem 2 is a new data processing inequality in the sense that the processing from Y
to Z reduces the correlation measure \lambda_i, i.e., the correlation between X and Z, \lambda_i(\tilde{P}_{XZ}), is
less than or equal to the correlation measure between X and Y, \lambda_i(\tilde{P}_{XY}). We note that this
theorem is similar to the data processing inequality in [9, p. 32] except that, instead of mutual
information, we use \lambda_i(\tilde{P}_{XY}) as the correlation measure. In the sequel, we will show that
this new data processing inequality helps us develop a necessary condition for the n-letter
Markov chain while the data processing inequality in its usual form [9, p. 32] is not useful
in this context.
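
As a numerical sanity check of (21), the following minimal Python sketch builds a binary Markov chain X \to Y \to Z and compares the three second singular values. The joint distribution, the channel from Y to Z, and the helper names are hypothetical choices made for illustration. The sketch uses the fact that, for a 2 x 2 matrix \tilde{P} with \lambda_1 = 1, \lambda_2 equals |det \tilde{P}|; this shortcut is valid only for binary alphabets.

```python
import math

def lambda2_binary(joint):
    """lambda_2 of P~ for a 2x2 joint pmf; with lambda_1 = 1 it equals |det P~|."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    det = joint[0][0] * joint[1][1] - joint[0][1] * joint[1][0]
    return abs(det) / math.sqrt(px[0] * px[1] * py[0] * py[1])

def joint_xz_from_chain(joint_xy, chan_z_given_y):
    """P_XZ(i,k) = sum_j P_XY(i,j) Pr(Z=k | Y=j) under X -> Y -> Z."""
    return [[sum(row[j] * chan_z_given_y[j][k] for j in range(len(row)))
             for k in range(len(chan_z_given_y[0]))]
            for row in joint_xy]

def joint_yz_from_channel(py, chan_z_given_y):
    """P_YZ(j,k) = Pr(Y=j) Pr(Z=k | Y=j)."""
    return [[py[j] * chan_z_given_y[j][k] for k in range(len(chan_z_given_y[j]))]
            for j in range(len(py))]

joint_xy = [[0.4, 0.1], [0.2, 0.3]]   # hypothetical P_XY
chan = [[0.9, 0.1], [0.25, 0.75]]     # hypothetical Pr(Z | Y); rows sum to 1
py = [sum(col) for col in zip(*joint_xy)]

lam_xy = lambda2_binary(joint_xy)
lam_yz = lambda2_binary(joint_yz_from_channel(py, chan))
lam_xz = lambda2_binary(joint_xz_from_chain(joint_xy, chan))

print(round(lam_xz, 6), round(lam_xy * lam_yz, 6), round(lam_xy, 6))
```

For binary alphabets the first inequality in (21) holds with equality, because the determinant of \tilde{P}_{XZ} = \tilde{P}_{XY} \tilde{P}_{YZ} factors; for larger alphabets one should expect a strict inequality in general.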



2.3 A Necessary Condition 

Now, we switch our attention to i.i.d. sequences of correlated sources. Let (U^n, V^n) be
a pair of i.i.d. (in time) sequences, where each letter of these sequences satisfies a joint
distribution P_{UV}. Thus, the joint distribution of the sequences is P_{U^nV^n} = P_{UV}^{\otimes n}, where
A^{\otimes 1} \triangleq A, A^{\otimes k} \triangleq A \otimes A^{\otimes (k-1)}, and \otimes denotes the Kronecker product of matrices [13].
From (12), we know that

P_{UV} = P_U^{1/2} \tilde{P}_{UV} P_V^{1/2} (26)
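Then,

P_{U^nV^n} = P_{UV}^{\otimes n} = (P_U^{1/2})^{\otimes n} \tilde{P}_{UV}^{\otimes n} (P_V^{1/2})^{\otimes n} (27)

We also have P_{U^n} = P_U^{\otimes n} and P_{V^n} = P_V^{\otimes n}. Thus,

\tilde{P}_{U^nV^n} = P_{U^n}^{-1/2} P_{U^nV^n} P_{V^n}^{-1/2}
= (P_U^{-1/2})^{\otimes n} (P_U^{1/2})^{\otimes n} \tilde{P}_{UV}^{\otimes n} (P_V^{1/2})^{\otimes n} (P_V^{-1/2})^{\otimes n}
= \tilde{P}_{UV}^{\otimes n} (28)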

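Now, applying SVD to \tilde{P}_{U^nV^n}, we have

\tilde{P}_{U^nV^n} = M_n \Lambda_n N_n^T = \tilde{P}_{UV}^{\otimes n} = M^{\otimes n} \Lambda^{\otimes n} (N^{\otimes n})^T (29)

From the uniqueness of the SVD, we know that M_n = M^{\otimes n}, \Lambda_n = \Lambda^{\otimes n} and N_n = N^{\otimes n}.
Then, the ordered singular values of \tilde{P}_{U^nV^n} are

{1, \lambda_2(\tilde{P}_{UV}), ..., \lambda_2(\tilde{P}_{UV}), ...}

where the second through the (n+1)-st singular values are all equal to \lambda_2(\tilde{P}_{UV}).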
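
The ordering above can be checked with a few lines of Python. The sketch below relies on the standard fact that the singular values of a Kronecker product are all pairwise products of the singular values of its factors; the single-letter spectrum [1, 0.4] and the function name are illustrative assumptions only.

```python
import itertools
import math

def kron_power_singular_values(single_letter_values, n):
    """Singular values of the n-fold Kronecker power, sorted in decreasing order.

    Every singular value of the power is a product of n single-letter values.
    """
    products = [math.prod(combo)
                for combo in itertools.product(single_letter_values, repeat=n)]
    return sorted(products, reverse=True)

# Assumed single-letter spectrum: lambda_1 = 1 and lambda_2 = 0.4.
values = kron_power_singular_values([1.0, 0.4], 3)
print([round(v, 4) for v in values])
```

With n = 3 the list starts with 1, followed by three copies of \lambda_2 (one for each position in which the factor \lambda_2 can appear), and every remaining value is a higher power of \lambda_2. The second-largest singular value therefore does not grow with n.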

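From Theorem 2, we know that if X_1 \to U^n \to V^n \to X_2 with n \to \infty, then, for
i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|),

\lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_2(\tilde{P}_{X_1U^n}) \lambda_i(\tilde{P}_{U^nV^n}) \lambda_2(\tilde{P}_{V^nX_2}) (30)

We showed above that \lambda_i(\tilde{P}_{U^nV^n}) \le \lambda_2(\tilde{P}_{UV}) for i \ge 2, and \lambda_i(\tilde{P}_{U^nV^n}) = \lambda_2(\tilde{P}_{UV}) for
i = 2, ..., n + 1. Therefore, for i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|), we have

\lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_2(\tilde{P}_{X_1U^n}) \lambda_2(\tilde{P}_{UV}) \lambda_2(\tilde{P}_{V^nX_2}) (31)

From Theorem 1, we know that \lambda_2(\tilde{P}_{X_1U^n}) \le 1 and \lambda_2(\tilde{P}_{V^nX_2}) \le 1. Next, in Theorem 3, we
determine that the least upper bound for \lambda_2(\tilde{P}_{X_1U^n}) and \lambda_2(\tilde{P}_{V^nX_2}) is also 1.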

Theorem 3 Let F(n, P_{X_1}) be the set of all joint distributions for X_1 and U^n with a given
marginal distribution for X_1, P_{X_1}. Then,

\sup_{F(n, P_{X_1}), n = 1, 2, ...} \lambda_2(\tilde{P}_{X_1U^n}) = 1 (32)

The proof of Theorem 3 is given in Appendix B.

Based on the above discussion, we have the following theorem. 
Theorem 4 If X_1 \to U^n \to V^n \to X_2, then, for i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|),

\lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_2(\tilde{P}_{UV}) (33)

Theorem 4 provides a single-letter necessary condition for the n-letter Markov chain
X_1 \to U^n \to V^n \to X_2 on the joint probability distribution p(x_1, x_2). This theorem also
answers the questions we posed in Section 1. Our first question was whether (X_1, X_2) can
be arbitrarily correlated when n goes to infinity. Theorem 4 shows that (X_1, X_2) cannot be
arbitrarily correlated, as the correlation measures between (X_1, X_2), \lambda_i(\tilde{P}_{X_1X_2}), are upper
bounded by \lambda_2(\tilde{P}_{UV}), the second correlation measure of the single-letter sources (U, V). Our
second question was how much extra correlation (X_1, X_2) can gain when n goes from 1 to \infty.
Although we have no exact answer for this question, the following observation may provide
some insights into this problem. From Theorem 2, we know that, if X_1 \to U \to V \to X_2,

\lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_i(\tilde{P}_{UV}), i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|) (34)

Theorem 4 shows, on the other hand, that, if X_1 \to U^n \to V^n \to X_2,

\lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_2(\tilde{P}_{UV}), i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|) (35)

Therefore, we note that n going from 1 to \infty increases the upper bounds^5 for the correlation
measures \lambda_i(\tilde{P}_{X_1X_2}) from \lambda_i(\tilde{P}_{UV}) to \lambda_2(\tilde{P}_{UV}) for i = 3, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|).
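
The condition in (33) is straightforward to test for a candidate p(x_1, x_2). The following minimal Python sketch does so for binary alphabets, again using \lambda_2 = |det \tilde{P}| for 2 x 2 matrices. The source and the two candidate distributions, as well as the function names, are hypothetical examples chosen for illustration.

```python
import math

def lambda2_binary(joint):
    """lambda_2 of P~ for a 2x2 joint pmf; with lambda_1 = 1 it equals |det P~|."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    det = joint[0][0] * joint[1][1] - joint[0][1] * joint[1][0]
    return abs(det) / math.sqrt(px[0] * px[1] * py[0] * py[1])

def passes_theorem4(joint_x1x2, joint_uv, tol=1e-12):
    """Check the necessary condition lambda_2(P~_{X1X2}) <= lambda_2(P~_{UV})."""
    return lambda2_binary(joint_x1x2) <= lambda2_binary(joint_uv) + tol

joint_uv = [[0.45, 0.05], [0.05, 0.45]]         # assumed source, lambda_2 = 0.8
too_correlated = [[0.49, 0.01], [0.01, 0.49]]   # candidate with lambda_2 = 0.96
mild = [[0.40, 0.10], [0.10, 0.40]]             # candidate with lambda_2 = 0.6

print(passes_theorem4(too_correlated, joint_uv))
print(passes_theorem4(mild, joint_uv))
```

A candidate that fails the check cannot be produced from this source for any n. A candidate that passes is not thereby guaranteed to be achievable, since the condition is only necessary.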

As we mentioned in Section 1, the data processing inequality in its usual form [9, p. 32]
is not helpful in this problem, while our new data processing inequality, i.e., Theorem 2,
provides a single-letter necessary condition for this n-letter Markov chain. The main reason
for this difference is that while the mutual information, I(U^n; V^n), the correlation measure in
the original data processing inequality, increases linearly with n, \lambda_i(\tilde{P}_{U^nV^n}), the correlation
measure in our new data processing inequality, is bounded as n increases, and therefore
makes the problem more tractable.

Theorem 4 is valid for all discrete random variables. To illustrate the utility and also the
limitations of Theorem 4, we will study a binary example in detail in Appendix A. In this
example, (U, V) and (X_1, X_2) are binary random variables. For this specific binary example,
we will apply Theorem 4 to obtain a necessary condition for the n-letter Markov chain.
Moreover, the special structure of this binary example will enable us to provide a sharper
necessary condition than the one given in Theorem 4. We will compare these two necessary
conditions and a sufficient condition for this binary example.

^5 In general, these upper bounds are not tight.
2.4 Conditional Distributions

Theorem 4 in Section 2.3 provides a necessary condition for joint probability distributions
p(x_1, x_2) which satisfy the Markov chain X_1 \to U^n \to V^n \to X_2. In certain specific
problems, e.g., the multi-terminal rate-distortion problem and the multiple access channel with
correlated sources, in addition to p(x_1, x_2), the distributions of (X_1, X_2) conditioned on parts of
the n-letter sources may be needed, e.g., p(x_1, x_2 | u_1, v_1), p(x_1, x_2 | u_1, u_2, v_1, v_2), etc.^6 In this
section, we will develop a result similar to that in Theorem 4 for conditional distributions.

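For a pair of i.i.d. sequences (U^n, V^n) of length n, we define \underline{U} as an arbitrary subset of
{U_1, ..., U_n}, i.e.,

\underline{U} \triangleq {U_{i_1}, ..., U_{i_l}} \subset {U_1, ..., U_n} (36)

and similarly,

\underline{V} \triangleq {V_{j_1}, ..., V_{j_k}} \subset {V_1, ..., V_n} (37)

In the following theorem, we propose an upper bound for \lambda_i(\tilde{P}_{X_1X_2|\underline{u}\underline{v}}) when
X_1 \to U^n \to V^n \to X_2 is satisfied.

Theorem 5 Let (U^n, V^n) be a pair of i.i.d. sequences of length n, and let the random
variables X_1, X_2 satisfy X_1 \to U^n \to V^n \to X_2. Then, for i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|),

\lambda_i(\tilde{P}_{X_1X_2|\underline{u}\underline{v}}) \le \lambda_2(\tilde{P}_{UV}) (38)

where \underline{U} \subset {U_1, ..., U_n} and \underline{V} \subset {V_1, ..., V_n}.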

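Proof: We consider a special case of (\underline{U}, \underline{V}) as follows. We define \underline{U} = {U_1, ..., U_l} and
\underline{V} = {V_1, ..., V_m, V_{l+1}, ..., V_{l+k-m}}. We also define the complements of \underline{U} and \underline{V} as
\underline{U}^c \triangleq {U_1, ..., U_n} \setminus \underline{U} and \underline{V}^c \triangleq {V_1, ..., V_n} \setminus \underline{V}. If \underline{U} and \underline{V} take other forms,
we can transform them to the form we defined above by permutations. We know that

p(x_1, x_2, \underline{u}^c, \underline{v}^c | \underline{u}, \underline{v}) = p(x_1 | \underline{u}^c, \underline{u}, \underline{v}) p(\underline{u}^c, \underline{v}^c | \underline{u}, \underline{v}) p(x_2 | \underline{v}^c, \underline{v}, \underline{u}) (39)

In other words, given \underline{U} = \underline{u} and \underline{V} = \underline{v}, (X_1, \underline{U}^c, \underline{V}^c, X_2) form a Markov chain. Thus, from (22),

\tilde{P}_{X_1X_2|\underline{u}\underline{v}} = \tilde{P}_{X_1\underline{U}^c|\underline{u}\underline{v}} \tilde{P}_{\underline{U}^c\underline{V}^c|\underline{u}\underline{v}} \tilde{P}_{\underline{V}^cX_2|\underline{u}\underline{v}} (40)

Furthermore,

\tilde{P}_{\underline{U}^c\underline{V}^c|\underline{u}\underline{v}} = \tilde{p}_{V_{m+1}^{l}|u_{m+1}^{l}} \otimes \tilde{p}_{U_{l+1}^{l+k-m}|v_{l+1}^{l+k-m}} \otimes \tilde{P}_{U_{l+k-m+1}^{n} V_{l+k-m+1}^{n}} (41)

As mentioned earlier, a vector marginal distribution can be viewed as a joint distribution
matrix with a degenerate random variable whose alphabet size is equal to 1. Since the rank
of a vector is 1, from Theorem 1, the sole singular value of \tilde{p}_{V_{m+1}^{l}|u_{m+1}^{l}} (and of
\tilde{p}_{U_{l+1}^{l+k-m}|v_{l+1}^{l+k-m}}) is equal to 1. Then,

\lambda_i(\tilde{P}_{\underline{U}^c\underline{V}^c|\underline{u}\underline{v}}) = \lambda_i(\tilde{P}_{U_{l+k-m+1}^{n} V_{l+k-m+1}^{n}}) (42)

Combining (21), (40), and (42), we obtain

\lambda_i(\tilde{P}_{X_1X_2|\underline{u}\underline{v}}) \le \lambda_2(\tilde{P}_{UV}) (43)

which completes the proof. ■

^6 The reader may wish to consult Sections 3 and 4 for further motivations to consider conditional
probability distributions.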
2.5 General Result 

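In Sections 2.3 and 2.4, we proposed necessary conditions for the n-letter Markov chain
X_1 \to U^n \to V^n \to X_2 on p(x_1, x_2) and p(x_1, x_2 | \underline{u}, \underline{v}), respectively. With these tools, we
will develop a general result in this section. We define the set \mathcal{S}_{X_1X_2|\mathbf{U}\mathbf{V}} as follows

\mathcal{S}_{X_1X_2|\mathbf{U}\mathbf{V}} \triangleq {p(x_1, x_2 | \mathbf{u}, \mathbf{v}) : X_1 \to U^n \to V^n \to X_2, n \to \infty} (44)

where \mathbf{U} \subset {U_1, ..., U_n} and \mathbf{V} \subset {V_1, ..., V_n}. We may invoke Theorem 5 with
(\underline{U}, \underline{V}) = (\mathbf{U}, \mathbf{V}) and obtain

\mathcal{S}_{\mathbf{U}\mathbf{V}} \triangleq {p(x_1, x_2 | \mathbf{u}, \mathbf{v}) : \lambda_i(\tilde{P}_{X_1X_2|\mathbf{u}\mathbf{v}}) \le \lambda_2(\tilde{P}_{UV}), i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|)}
\supseteq \mathcal{S}_{X_1X_2|\mathbf{U}\mathbf{V}} (45)

In the following, we use Theorem 5 with different choices of set arguments to find a set that
is smaller than \mathcal{S}_{\mathbf{U}\mathbf{V}}, but still contains \mathcal{S}_{X_1X_2|\mathbf{U}\mathbf{V}}.

We note that for a given source distribution p(u, v), we can obtain p(x_1, x_2 | \mathbf{u}', \mathbf{v}') (or
equivalently \tilde{P}_{X_1X_2|\mathbf{u}'\mathbf{v}'}) for any \mathbf{U}' \subseteq \mathbf{U} and \mathbf{V}' \subseteq \mathbf{V}, from the conditional distribution
p(x_1, x_2 | \mathbf{u}, \mathbf{v}). Thus, if we define

\mathcal{S}_{\mathbf{U}'\mathbf{V}'} \triangleq {p(x_1, x_2 | \mathbf{u}, \mathbf{v}) : \lambda_i(\tilde{P}_{X_1X_2|\mathbf{u}'\mathbf{v}'}) \le \lambda_2(\tilde{P}_{UV}), i = 2, ..., min(|\mathcal{X}_1|, |\mathcal{X}_2|)} (46)

then, by invoking Theorem 5 with (\underline{U}, \underline{V}) = (\mathbf{U}', \mathbf{V}'), we have

\mathcal{S}_{X_1X_2|\mathbf{U}\mathbf{V}} \subseteq \mathcal{S}_{\mathbf{U}'\mathbf{V}'} (47)

Consequently, if we define

\mathcal{S}'_{X_1X_2|\mathbf{U}\mathbf{V}} \triangleq \bigcap_{\mathbf{U}' \subseteq \mathbf{U}, \mathbf{V}' \subseteq \mathbf{V}} \mathcal{S}_{\mathbf{U}'\mathbf{V}'} (48)

then, we have

\mathcal{S}_{X_1X_2|\mathbf{U}\mathbf{V}} \subseteq \mathcal{S}'_{X_1X_2|\mathbf{U}\mathbf{V}} \subseteq \mathcal{S}_{\mathbf{U}\mathbf{V}} (49)

That is, when we need a necessary condition on p(x_1, x_2 | \mathbf{u}, \mathbf{v}), even though \mathcal{S}_{\mathbf{U}\mathbf{V}} provides
such a necessary condition, we can obtain a smaller probability set and therefore a stricter
necessary condition by combining the necessary conditions for all p(x_1, x_2 | \mathbf{u}', \mathbf{v}') where the
sets \mathbf{U}' and \mathbf{V}' are included in the sets \mathbf{U} and \mathbf{V}, respectively.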



3 Example I: Multi-terminal Rate-distortion Region 

Ever since the milestone paper of Wyner and Ziv [15] on the rate-distortion function of a
single source with side information at the decoder, there has been a significant amount of
effort directed towards solving a generalization of this problem, the so-called multi-terminal
rate-distortion problem. Among all the attempts on this difficult problem, the notable works
by Tung [4] and Housewright [5] (see also [6]) provide the inner and outer bounds for the
rate-distortion region. More recent progress on this problem is due to Wagner and Anantharam
[16], where a tighter outer bound is given. A very promising and very recent result can
be found in [17].

The multi-terminal rate-distortion problem can be formulated as follows. Consider
a pair of discrete memoryless sources (U, V), with joint distribution p(u, v) defined on
the finite alphabet \mathcal{U} \times \mathcal{V}. The reconstructions of the sources are built on another
finite alphabet \hat{\mathcal{U}} \times \hat{\mathcal{V}}. The distortion measures are defined as
d_1 : \mathcal{U} \times \hat{\mathcal{U}} \to \mathbb{R}^+ \cup {0} and d_2 : \mathcal{V} \times \hat{\mathcal{V}} \to \mathbb{R}^+ \cup {0}.
Assume that two distributed encoders are functions
f_1 : \mathcal{U}^n \to {1, 2, ..., M_1} and f_2 : \mathcal{V}^n \to {1, 2, ..., M_2} and a joint decoder is the function
g : {1, 2, ..., M_1} \times {1, 2, ..., M_2} \to \hat{\mathcal{U}}^n \times \hat{\mathcal{V}}^n, where n is a positive integer. A pair of
distortion levels \mathbf{D} \triangleq (D_1, D_2) is said to be \mathbf{R}-attainable, for some rate pair \mathbf{R} \triangleq (R_1, R_2),
if for all \epsilon > 0 and \delta > 0, there exist some positive integer n and a set of distributed
encoders and joint decoder (f_1, f_2, g) with rates (\frac{1}{n} \log_2 M_1, \frac{1}{n} \log_2 M_2) = (R_1 + \delta, R_2 + \delta), such
that the distortion between the sources (U^n, V^n) and the decoder output (\hat{U}^n, \hat{V}^n) satisfies^7
(E d_1(U^n, \hat{U}^n), E d_2(V^n, \hat{V}^n)) < (D_1 + \epsilon, D_2 + \epsilon), where
d_1(U^n, \hat{U}^n) \triangleq \frac{1}{n} \sum_{i=1}^{n} d_1(U_i, \hat{U}_i) and d_2(V^n, \hat{V}^n) \triangleq \frac{1}{n} \sum_{i=1}^{n} d_2(V_i, \hat{V}_i).
The problem here is to determine, for a fixed \mathbf{D}, the set
\mathcal{R}(\mathbf{D}) of all rate pairs \mathbf{R} for which \mathbf{D} is \mathbf{R}-attainable.

3.1 Existing Results 

We restate the outer bound provided in [4] and [5] in the following theorem.

^7 By (A, B) < (C, D), we mean both A < C and B < D, and (A, B) \le (C, D) is defined in a similar
manner.

Theorem 6 [4, 5] \mathcal{R}(\mathbf{D}) \subseteq \mathcal{R}_{out,1}(\mathbf{D}), where \mathcal{R}_{out,1}(\mathbf{D}) is the set of all \mathbf{R} such that there
exists a pair of discrete random variables (X_1, X_2), for which the following three conditions
are satisfied:

1. The joint distribution satisfies

X_1 \to U \to V (50)
U \to V \to X_2 (51)

2. The rate pair satisfies

R_1 \ge I(U, V; X_1 | X_2) (52)
R_2 \ge I(U, V; X_2 | X_1) (53)
R_1 + R_2 \ge I(U, V; X_1, X_2) (54)

3. There exists (\hat{U}(X_1, X_2), \hat{V}(X_1, X_2)) such that (E d_1(U, \hat{U}), E d_2(V, \hat{V})) \le \mathbf{D}.

An inner bound is also given in [4] and [5] as follows.

Theorem 7 [4, 5] \mathcal{R}(\mathbf{D}) \supseteq \mathcal{R}_{in}(\mathbf{D}), where \mathcal{R}_{in}(\mathbf{D}) is the set of all \mathbf{R} such that there exists
a pair of discrete random variables (X_1, X_2), for which the following three conditions are
satisfied:

1. The joint distribution satisfies

X_1 \to U \to V \to X_2 (55)

2. The rate pair satisfies

R_1 \ge I(U, V; X_1 | X_2) (56)
R_2 \ge I(U, V; X_2 | X_1) (57)
R_1 + R_2 \ge I(U, V; X_1, X_2) (58)

3. There exists (\hat{U}(X_1, X_2), \hat{V}(X_1, X_2)) such that (E d_1(U, \hat{U}), E d_2(V, \hat{V})) \le \mathbf{D}.
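
To illustrate the rate conditions (56)-(58), the following minimal Python sketch evaluates the three mutual information expressions for one distribution that satisfies the Markov chain (55). The binary source, the two binary symmetric test channels (flip probabilities 0.1 and 0.2), and all function names are hypothetical choices made for illustration; the sketch does not address the distortion condition.

```python
import itertools
import math

def entropy(pmf, keep):
    """Entropy in bits of the marginal of `pmf` on the coordinate indices `keep`."""
    marg = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def cond_mi(pmf, a, b, c=()):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))

def bsc(flip):
    """Binary symmetric channel as rows Pr(output | input)."""
    return [[1 - flip, flip], [flip, 1 - flip]]

# Assumed source and test channels: X1 depends on U only, X2 on V only,
# so that X1 -> U -> V -> X2 holds by construction.
joint_uv = [[0.45, 0.05], [0.05, 0.45]]
chan1, chan2 = bsc(0.1), bsc(0.2)
pmf = {(u, v, x1, x2): joint_uv[u][v] * chan1[u][x1] * chan2[v][x2]
       for u, v, x1, x2 in itertools.product(range(2), repeat=4)}

U, V, X1, X2 = 0, 1, 2, 3              # coordinate positions in each outcome
r1 = cond_mi(pmf, (U, V), (X1,), (X2,))   # right-hand side of (56)
r2 = cond_mi(pmf, (U, V), (X2,), (X1,))   # right-hand side of (57)
rsum = cond_mi(pmf, (U, V), (X1, X2))     # right-hand side of (58)
print(round(r1, 4), round(r2, 4), round(rsum, 4))
```

Because X_1 and X_2 are conditionally independent given (U, V), the sum-rate bound exceeds r1 + r2 by exactly I(X_1; X_2), so all three constraints are active in shaping the region.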

We note that the inner and outer bounds agree on both the second condition, i.e., the rate
constraints in terms of some mutual information expressions, and the third condition, i.e., the
reconstruction functions. However, the first conditions in these two bounds, which constrain the
underlying probability distributions p(x_1, x_2 | u, v), are different. It is easy to see that the
Markov chain condition in the inner bound, i.e., X_1 \to U \to V \to X_2, implies the Markov
chain conditions in the outer bound, i.e., X_1 \to U \to V and U \to V \to X_2. Hence, if we
define

\mathcal{S}_{out,1} \triangleq {p(x_1, x_2 | u, v) : X_1 \to U \to V and U \to V \to X_2} (59)

\mathcal{S}_{in} \triangleq {p(x_1, x_2 | u, v) : X_1 \to U \to V \to X_2} (60)

then,

\mathcal{S}_{in} \subseteq \mathcal{S}_{out,1} (61)

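Using the time-sharing argument, a convexification of the inner bound \mathcal{R}_{in}(\mathbf{D}) yields another
inner bound \mathcal{R}'_{in}(\mathbf{D}), which is larger than \mathcal{R}_{in}(\mathbf{D}). This new inner bound may be expressed
as a function of \mathcal{S}_{in} and \mathbf{D} as follows,

\mathcal{R}_{in}(\mathbf{D}) \subseteq \mathcal{R}'_{in}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{in}, \mathbf{D}) \subseteq \mathcal{R}(\mathbf{D}) (62)

where, using a time-sharing random variable Q, which is known by the encoders and the
decoder, \mathcal{F}(\mathcal{S}_{in}, \mathbf{D}) is defined as

\mathcal{F}(\mathcal{S}_{in}, \mathbf{D}) \triangleq \bigcup_{p \in \mathcal{P}(\mathcal{S}_{in}, \mathbf{D})} \mathcal{C}(p) (63)

p \triangleq p(x_1, x_2, q | u, v) = p_q(x_1, x_2 | u, v) p(q) (64)

\mathcal{P}(\mathcal{S}_{in}, \mathbf{D}) \triangleq {p : p_q(x_1, x_2 | u, v) \in \mathcal{S}_{in}, \exists (\hat{U}(X_1, X_2, Q), \hat{V}(X_1, X_2, Q))
s.t. (E d_1(U, \hat{U}), E d_2(V, \hat{V})) \le \mathbf{D}} (65)

\mathcal{C}(p) \triangleq {(R_1, R_2) : R_1 \ge I(U, V; X_1 | X_2, Q),
R_2 \ge I(U, V; X_2 | X_1, Q),
R_1 + R_2 \ge I(U, V; X_1, X_2 | Q)} (66)

From the definition of the function \mathcal{F}, we can see that \mathcal{F} is monotonic with respect to the
set argument when the distortion argument is fixed, i.e.,

\mathcal{F}(\mathcal{A}, \mathbf{D}) \subseteq \mathcal{F}(\mathcal{B}, \mathbf{D}), if \mathcal{A} \subseteq \mathcal{B} (67)

In [5], it was shown that \mathcal{R}_{out,1}(\mathbf{D}) is convex. Thus, \mathcal{R}_{out,1}(\mathbf{D}) can be represented in
terms of the function \mathcal{F} as well, i.e.,

\mathcal{R}_{out,1}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,1}, \mathbf{D}) (68)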
The result by Wagner and Anantharam [16] can also be expressed by using the function \mathcal{F}
as^8

\mathcal{R}_{out,2}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,2}, \mathbf{D}) (69)

where

\mathcal{S}_{out,2} \triangleq {p(x_1, x_2 | u, v) : \exists w, p(x_1, x_2, w | u, v) = p(w) p(x_1 | w, u) p(x_2 | w, v)} (70)

The distribution in (70) may be represented by the following Markov chain like notation

X_1     X_2
   \   /        (71)
     W

We note that

\mathcal{S}_{in} \subseteq \mathcal{S}_{out,2} \subseteq \mathcal{S}_{out,1} (72)

Therefore, we conclude that the gap between the inner and the outer bounds comes only
from the difference between the feasible sets of the probability distributions p(x_1, x_2 | u, v). In
the next section, we will provide a tighter outer bound for the rate region in the sense that
it can be represented using the same mutual information expressions, however, on a smaller
feasible set for p(x_1, x_2 | u, v) than \mathcal{R}_{out,2}(\mathbf{D}).

3.2 A New Outer Bound 

We propose a new outer bound for the multi-terminal rate-distortion region as follows.

Theorem 8 \mathcal{R}(\mathbf{D}) \subseteq \mathcal{R}_{out,3}(\mathbf{D}), where \mathcal{R}_{out,3}(\mathbf{D}) is the set of all \mathbf{R} such that there exist
some positive integer n, and discrete random variables Q, X_1, X_2 for which the following
three conditions are satisfied:

1. The joint distribution satisfies

p(u^n, v^n, x_1, x_2, q) = p(q) p(x_1 | u^n, q) p(x_2 | v^n, q) \prod_{i=1}^{n} p(u_i, v_i) (73)

2. The rate pair satisfies

R_1 \ge I(U_1, V_1; X_1 | X_2, Q) (74)
R_2 \ge I(U_1, V_1; X_2 | X_1, Q) (75)
R_1 + R_2 \ge I(U_1, V_1; X_1, X_2 | Q) (76)

where (U_1, V_1) is the first sample of the n-sequences (U^n, V^n).

3. There exists (\hat{U}(X_1, X_2, Q), \hat{V}(X_1, X_2, Q)) such that (E d_1(U, \hat{U}), E d_2(V, \hat{V})) \le \mathbf{D}.

or equivalently,

\mathcal{R}_{out,3}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,3}, \mathbf{D}) (77)

where

\mathcal{S}_{out,3} \triangleq {p(x_1, x_2 | u_1, v_1) : X_1 \to U^n \to V^n \to X_2} (78)

^8 This is a simplified version of [16] with the assumption that there is no hidden source behind (U^n, V^n).

Proof: We consider an arbitrary triple (f_1, f_2, g) of two distributed encoders and one joint
decoder with reconstructions (\hat{U}^n, \hat{V}^n) = g(Y, Z), where Y = f_1(U^n) and Z = f_2(V^n), such
that the distortions satisfy (E d_1(U^n, \hat{U}^n), E d_2(V^n, \hat{V}^n)) < (D_1 + \epsilon, D_2 + \epsilon). Here, we use
R_1 = \frac{1}{n} \log_2(M_1) = \frac{1}{n} \log_2(|\mathcal{Y}|) and R_2 = \frac{1}{n} \log_2(M_2) = \frac{1}{n} \log_2(|\mathcal{Z}|).

We define the auxiliary random variables X_{1i} = (Y, U^{i-1}) and X_{2i} = (Z, V^{i-1}). Then, we
have

\log_2(M_1) \ge H(Y)
= I(U^n, V^n; Y)
\ge^{1} I(U^n, V^n; Y | Z)
= \sum_{i=1}^{n} I(U_i, V_i; Y | Z, U^{i-1}, V^{i-1})
= \sum_{i=1}^{n} [ I(U_i, V_i; Y, Z | U^{i-1}, V^{i-1}) - I(U_i, V_i; Z | U^{i-1}, V^{i-1}) ]
=^{2} \sum_{i=1}^{n} [ I(U_i, V_i; Y, Z | U^{i-1}, V^{i-1}) - I(U_i, V_i; Z | V^{i-1}) ]
= \sum_{i=1}^{n} [ I(U_i, V_i; Y, Z, U^{i-1} | V^{i-1}) - I(U_i, V_i; U^{i-1} | V^{i-1}) - I(U_i, V_i; Z | V^{i-1}) ]
=^{3} \sum_{i=1}^{n} [ I(U_i, V_i; Y, Z, U^{i-1} | V^{i-1}) - I(U_i, V_i; Z | V^{i-1}) ]
= \sum_{i=1}^{n} I(U_i, V_i; Y, U^{i-1} | Z, V^{i-1})
= \sum_{i=1}^{n} I(U_i, V_i; X_{1i} | X_{2i}) (79)

where

1. follows from the fact that Y \to U^n \to V^n \to Z. We observe that the equality holds
when Y is independent of Z;

2. follows from the fact that

p(z | u_i, v_i, v^{i-1}) = p(z | u_i, v_i, u^{i-1}, v^{i-1}) (80)

3. follows from the memoryless property of the sources.
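Using a symmetrical argument, we obtain

\log_2(M_2) \ge \sum_{i=1}^{n} I(U_i, V_i; X_{2i} | X_{1i}) (81)

Moreover,

\log_2(M_1 M_2) \ge H(Y, Z)
= I(U^n, V^n; Y, Z)
= \sum_{i=1}^{n} I(U_i, V_i; Y, Z | U^{i-1}, V^{i-1})
= \sum_{i=1}^{n} I(U_i, V_i; X_{1i}, X_{2i}) (82)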

We introduce a time-sharing random variable Q, which is uniformly distributed on
{1, ..., n} and independent of U^n and V^n. Let the random variables X_1 and X_2 be such
that

p(x_{1i}, x_{2i} | u_i, v_i, u_i^c, v_i^c) = p(x_1, x_2 | u_1, v_1, u_1^c, v_1^c, Q = i) (83)

where U_i^c \triangleq {U_1, ..., U_{i-1}, U_{i+1}, ..., U_n} and V_i^c is defined similarly. Then,

\sum_{i=1}^{n} I(U_i, V_i; X_{1i} | X_{2i}) = n I(U_1, V_1; X_1 | X_2, Q) (84)

\sum_{i=1}^{n} I(U_i, V_i; X_{2i} | X_{1i}) = n I(U_1, V_1; X_2 | X_1, Q) (85)

\sum_{i=1}^{n} I(U_i, V_i; X_{1i}, X_{2i}) = n I(U_1, V_1; X_1, X_2 | Q) (86)

The reconstruction pair (\hat{U}, \hat{V}) is defined as follows. When Q = i, (\hat{U}, \hat{V}) \triangleq (\hat{U}_i, \hat{V}_i),
i.e., the i-th letter of (\hat{U}^n, \hat{V}^n) = g(Y, Z). (\hat{U}_i, \hat{V}_i) is a function of (Y, Z), and, therefore,
it is a function of (X_1, X_2, Q). Hence, we have that (\hat{U}, \hat{V}) is a function of (X_1, X_2, Q),
i.e., (\hat{U}(X_1, X_2, Q), \hat{V}(X_1, X_2, Q)). It is easy to see that

(E d_1(U, \hat{U}), E d_2(V, \hat{V})) = (E d_1(U^n, \hat{U}^n), E d_2(V^n, \hat{V}^n)) < (D_1 + \epsilon, D_2 + \epsilon) (87)

which completes the proof. ■

Next, we state and prove that our outer bound given in Theorem 8 is tighter than the outer bound $\mathcal{R}_{out,2}(\mathbf{D})$ given earlier.

Theorem 9

$$\mathcal{R}_{out,3}(\mathbf{D}) \subseteq \mathcal{R}_{out,2}(\mathbf{D}) \tag{88}$$

Proof: Here, we provide two proofs. First, we prove this theorem by construction. For every point $(R_1, R_2)$ in $\mathcal{R}_{out,3}(\mathbf{D})$, there exist random variables $Q, X_1, X_2$ satisfying (73), such that the pair $(R_1, R_2)$ satisfies (74), (75) and (76), and a reconstruction pair $(\hat{U}(X_1, X_2, Q), \hat{V}(X_1, X_2, Q))$ such that $(E d_1(U, \hat{U}), E d_2(V, \hat{V})) \le \mathbf{D}$. According to [5], let $X_1' = (X_1, Q)$ and $X_2' = (X_2, Q)$. Then, $p(x_1', x_2' | u_1, v_1)$ belongs to the set $\mathcal{S}_{out,2}$. Moreover,

$$R_1 \ge I(U, V; X_1 | X_2, Q) = I(U, V; X_1' | X_2') \tag{89}$$

and similarly,

$$R_2 \ge I(U, V; X_2 | X_1, Q) = I(U, V; X_2' | X_1') \tag{90}$$

and finally,

$$
\begin{aligned}
R_1 + R_2 &\ge I(U, V; X_1, X_2 | Q) \\
&= H(U, V | Q) - H(U, V | X_1, X_2, Q) \\
&\overset{1}{=} H(U, V) - H(U, V | X_1, X_2, Q) \\
&= H(U, V) - H(U, V | X_1', X_2') \\
&= I(U, V; X_1', X_2')
\end{aligned}
\tag{91}
$$

where 1. follows from the fact that $Q$ is independent of $(U, V)$. The reconstruction pair $(\hat{U}, \hat{V})$ is a function of $(X_1, X_2, Q)$, and, therefore, it is a function of $(X_1', X_2') = ((X_1, Q), (X_2, Q))$.

Hence, for every rate pair $(R_1, R_2) \in \mathcal{R}_{out,3}(\mathbf{D})$, there exist random variables $X_1', X_2'$ such that $p(x_1', x_2' | u_1, v_1) \in \mathcal{S}_{out,2}$, the pair $(R_1, R_2)$ satisfies the mutual information constraints, and the reconstruction satisfies the distortion constraints. In other words, $(R_1, R_2) \in \mathcal{R}_{out,2}(\mathbf{D})$, proving the theorem.

An alternative proof comes from the comparison of $\mathcal{S}_{out,2}$ and $\mathcal{S}_{out,3}$, the feasible sets of probability distributions$^{9}$ $p(x_1, x_2 | u_1, v_1)$. We note that $X_1 \to U^n \to V^n \to X_2$ implies the Markov chain like condition in (71), which means that

$$\mathcal{S}_{out,3} \subseteq \mathcal{S}_{out,2} \tag{92}$$

and because of the monotonic property of $\mathcal{F}(\cdot, \mathbf{D})$ in (67), we have

$$\mathcal{F}(\mathcal{S}_{out,3}, \mathbf{D}) = \mathcal{R}_{out,3}(\mathbf{D}) \subseteq \mathcal{R}_{out,2}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,2}, \mathbf{D}) \tag{93}$$

$^{9}$ In $\mathcal{S}_{out,2}$, the probability distribution is $p(x_1, x_2 | u, v)$. Here, we just rename $U = U_1$ and $V = V_1$.



3.3 A New Necessary Condition 

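From the proof of Theorem 8, we note that $(X_{1i}, X_{2i})$ satisfies an n-letter Markov chain constraint $X_{1i} \to U^n \to V^n \to X_{2i}$. From the discussion in Section 2.5, we know that if the random variables $X_1$ and $X_2$ satisfy $X_1 \to U^n \to V^n \to X_2$, then

$$\lambda_i(\tilde{P}_{X_1 X_2}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 2, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{94}$$

$$\lambda_i(\tilde{P}_{X_1 X_2 | u_1}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{95}$$

$$\lambda_i(\tilde{P}_{X_1 X_2 | v_1}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{96}$$

$$\lambda_i(\tilde{P}_{X_1 X_2 | u_1 v_1}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{97}$$

or equivalently

$$\mathcal{S}_{out,3} \subseteq \mathcal{S}_{out,4} \tag{98}$$

where

$$\mathcal{S}_{out,4} \triangleq \{ p(x_1, x_2 | u_1, v_1) : \text{(94), (95), (96), and (97) are satisfied} \} \tag{99}$$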



Thus, we have the following theorem 

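Theorem 10 $\mathcal{R}(\mathbf{D}) \subseteq \mathcal{R}_{out,4}(\mathbf{D})$, where $\mathcal{R}_{out,4}(\mathbf{D})$ is the set of all $\mathbf{R}$ such that there exist a discrete random variable $Q$ independent of $(U, V)$, and discrete random variables $X_1, X_2$ for which the following three conditions are satisfied:

1. The joint distribution satisfies

$$\lambda_i(\tilde{P}_{X_1 X_2 | q}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 2, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{100}$$

$$\lambda_i(\tilde{P}_{X_1 X_2 | u q}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{101}$$

$$\lambda_i(\tilde{P}_{X_1 X_2 | v q}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{102}$$

$$\lambda_i(\tilde{P}_{X_1 X_2 | u v q}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{103}$$

2. The rate pair satisfies

$$R_1 \ge I(U, V; X_1 | X_2, Q) \tag{104}$$

$$R_2 \ge I(U, V; X_2 | X_1, Q) \tag{105}$$

$$R_1 + R_2 \ge I(U, V; X_1, X_2 | Q) \tag{106}$$

3. There exists $(\hat{U}(X_1, X_2, Q), \hat{V}(X_1, X_2, Q))$ such that $(E d_1(U, \hat{U}), E d_2(V, \hat{V})) \le \mathbf{D}$.

Equivalently,

$$\mathcal{R}_{out,4}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,4}, \mathbf{D}) \tag{107}$$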

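From Section 2.3 we have that

$$\mathcal{S}_{out,3} \subseteq \mathcal{S}_{out,4} \tag{108}$$

and therefore

$$\mathcal{R}_{out,3}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,3}, \mathbf{D}) \subseteq \mathcal{R}_{out,4}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,4}, \mathbf{D}) \tag{109}$$

From Theorem 9 we know that

$$\mathcal{S}_{out,3} \subseteq \mathcal{S}_{out,2} \tag{110}$$

and

$$\mathcal{R}_{out,3}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,3}, \mathbf{D}) \subseteq \mathcal{R}_{out,2}(\mathbf{D}) = \mathcal{F}(\mathcal{S}_{out,2}, \mathbf{D}) \tag{111}$$

So far, we have not been able to determine whether $\mathcal{S}_{out,4} \subseteq \mathcal{S}_{out,2}$ or $\mathcal{S}_{out,2} \subseteq \mathcal{S}_{out,4}$. However, we know that there exists some probability distribution $p(x_1, x_2 | u_1, v_1)$ which belongs to $\mathcal{S}_{out,2}$ but does not belong to $\mathcal{S}_{out,4}$. For example, assume $\lambda_2(\tilde{P}_{UV}) < 1$ and let $W$ be a random variable independent of $(U, V)$. Let $X_1 = (f_1(U_1), W)$ and $X_2 = (f_2(V_1), W)$. We note that $(X_1, X_2, U_1, V_1)$ satisfies the Markov chain like condition in (71), i.e., $p(x_1, x_2 | u_1, v_1) \in \mathcal{S}_{out,2}$. But $(X_1, X_2)$ contains the common information $W$, which means that $\lambda_2(\tilde{P}_{X_1 X_2}) = 1 > \lambda_2(\tilde{P}_{UV})$ [10], and therefore $p(x_1, x_2 | u_1, v_1) \notin \mathcal{S}_{out,4}$. Based on this observation, we note that introducing $\mathcal{S}_{out,4}$ helps us rule out some unachievable probability distributions that may exist in $\mathcal{S}_{out,2}$. The relation between the different feasible sets of probability distributions $p(x_1, x_2 | u_1, v_1)$ is illustrated in Figure 1.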
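The common-information example above can be checked numerically in the binary case. The following is a minimal sketch, not part of the original derivation: it assumes a hypothetical doubly symmetric binary source with crossover probability 0.25, takes $f_1$ and $f_2$ to be constants so that $X_1 = X_2 = W$ for a uniform bit $W$, and uses the fact that for a $2 \times 2$ joint distribution the normalized matrix has singular values $1$ and $\lambda_2$, so that $\lambda_2$ equals the absolute value of its determinant. The function name `lambda2_binary` is illustrative.

```python
import math


def lambda2_binary(joint):
    """Return lambda_2 of the normalized matrix of a 2x2 joint pmf.

    The normalized matrix diag(px)^(-1/2) P diag(py)^(-1/2) has singular
    values 1 and lambda_2, so lambda_2 is the absolute value of its
    determinant.
    """
    (p00, p01), (p10, p11) = joint
    px = (p00 + p01, p10 + p11)  # marginal of the row variable
    py = (p00 + p10, p01 + p11)  # marginal of the column variable
    det = p00 * p11 - p01 * p10
    return abs(det) / math.sqrt(px[0] * px[1] * py[0] * py[1])


# Assumed source: doubly symmetric binary pair with crossover 0.25.
p_uv = [[0.375, 0.125], [0.125, 0.375]]
# Assumed auxiliaries: X1 = X2 = W, a uniform bit independent of (U, V).
p_x1x2 = [[0.5, 0.0], [0.0, 0.5]]

print(lambda2_binary(p_uv))    # 0.5
print(lambda2_binary(p_x1x2))  # 1.0
```

Under these assumptions the shared bit gives $\lambda_2(\tilde{P}_{X_1X_2}) = 1$, which exceeds $\lambda_2(\tilde{P}_{UV}) = 0.5$, so the spectral condition excludes this distribution.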

Finally, we note that we can obtain a tighter outer bound in terms of the function $\mathcal{F}(\cdot, \mathbf{D})$ by using a set argument which is the intersection of $\mathcal{S}_{out,2}$ and $\mathcal{S}_{out,4}$, i.e.,

$$\mathcal{R}_{out,2 \cap 4}(\mathbf{D}) \triangleq \mathcal{F}(\mathcal{S}_{out,2} \cap \mathcal{S}_{out,4}, \mathbf{D}) \tag{112}$$

It is straightforward to see that this outer bound $\mathcal{R}_{out,2 \cap 4}(\mathbf{D})$ is in general tighter than the outer bound $\mathcal{F}(\mathcal{S}_{out,2}, \mathbf{D})$.

Figure 1: Different sets of probability distributions $p(x_1, x_2 | u, v)$.



4 Example II: Multiple Access Channel with Correlated Sources

The problem of determining the capacity region of the multiple access channel with correlated sources can be formulated as follows. Given a pair of i.i.d. correlated sources $(U, V)$ described by the joint probability distribution $p(u, v)$, and a discrete, memoryless, multiple access channel characterized by the transition probability $p(y | x_1, x_2)$, what are the necessary and sufficient conditions for the reliable transmission of $n$ samples of the sources through the channel, in $n$ channel uses, as $n \to \infty$?

4.1 Existing Results 

The multiple access channel with correlated sources was studied by Cover, El Gamal and 
Salehi in [7] (a simpler proof was given in [8]), where an achievable region expressed by 
single-letter entropies and mutual informations was given as follows. 

Theorem 11 [7] A source $(U, V)$ with joint distribution $p(u, v)$ can be sent with arbitrarily small probability of error over a multiple access channel characterized by $p(y | x_1, x_2)$, if there exist probability mass functions $p(s)$, $p(x_1 | u, s)$, $p(x_2 | v, s)$, such that

$$H(U | V) < I(X_1; Y | X_2, V, S) \tag{113}$$

$$H(V | U) < I(X_2; Y | X_1, U, S) \tag{114}$$

$$H(U, V | W) < I(X_1, X_2; Y | W, S) \tag{115}$$

$$H(U, V) < I(X_1, X_2; Y) \tag{116}$$

where

$$p(s, u, v, x_1, x_2, y) = p(s) p(u, v) p(x_1 | u, s) p(x_2 | v, s) p(y | x_1, x_2) \tag{117}$$

and

$$w = f(u) = g(v) \tag{118}$$

is the common information in the sense of Witsenhausen, Gacs and Korner (see [10]).

The above region can be simplified if there is no common information between $U$ and $V$ as follows [7]

$$H(U | V) < I(X_1; Y | X_2, V) \tag{119}$$

$$H(V | U) < I(X_2; Y | X_1, U) \tag{120}$$

$$H(U, V) < I(X_1, X_2; Y) \tag{121}$$

where

$$p(u, v, x_1, x_2, y) = p(u, v) p(x_1 | u) p(x_2 | v) p(y | x_1, x_2) \tag{122}$$

This achievable region was shown to be suboptimal by Dueck [18].

Cover, El Gamal and Salehi [7] also provided a capacity result with both achievability and converse in the form of some incomputable n-letter mutual informations. Their result is restated in the following theorem.

Theorem 12 [7] The correlated sources $(U, V)$ can be communicated reliably over the discrete memoryless multiple access channel $p(y | x_1, x_2)$ if and only if

$$[H(U | V), H(V | U), H(U, V)] \in \bigcup_{n=1}^{\infty} \mathcal{C}_n \tag{123}$$

where

$$
\mathcal{C}_n = \left\{ [R_1, R_2, R_3] :
\begin{array}{l}
R_1 < \frac{1}{n} I(X_1^n; Y^n | X_2^n, V^n) \\
R_2 < \frac{1}{n} I(X_2^n; Y^n | X_1^n, U^n) \\
R_3 < \frac{1}{n} I(X_1^n, X_2^n; Y^n)
\end{array}
\right\}
\tag{124}
$$

for some

$$p(u^n, v^n, x_1^n, x_2^n, y^n) = p(x_1^n | u^n) p(x_2^n | v^n) \prod_{i=1}^{n} p(u_i, v_i) \prod_{i=1}^{n} p(y_i | x_{1i}, x_{2i}) \tag{125}$$

i.e., for some $X_1^n$ and $X_2^n$ that satisfy the Markov chain $X_1^n \to U^n \to V^n \to X_2^n$.

Some recent results on the transmission of correlated sources over multiple access channels 
can be found in [19,20]. 

4.2 A New Outer Bound 

We propose a new outer bound for the multiple access channel with correlated sources as 
follows. 

Theorem 13 If a pair of i.i.d. sources $(U, V)$ with joint distribution $p(u, v)$ can be transmitted reliably through a discrete, memoryless, multiple access channel characterized by $p(y | x_1, x_2)$, then

$$H(U | V) \le I(X_1; Y | X_2, \mathbf{V}, Q) \tag{126}$$

$$H(V | U) \le I(X_2; Y | X_1, \mathbf{U}, Q) \tag{127}$$

$$H(U, V) \le I(X_1, X_2; Y | Q) \tag{128}$$

where the random variables $X_1$, $X_2$ and $Q$ are such that

$$p(x_1, x_2, y, u^n, v^n, q) = p(q) p(x_1 | u^n, q) p(x_2 | v^n, q) p(y | x_1, x_2) \prod_{i=1}^{n} p(u_i, v_i) \tag{129}$$

where $(U^n, V^n)$ are $n$ samples of the i.i.d. sources with $n \to \infty$, $\mathbf{U} \subset \{U_1, \dots, U_n\}$ and $\mathbf{V} \subset \{V_1, \dots, V_n\}$, and both $\mathbf{U}$ and $\mathbf{V}$ contain a finite number of elements.

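Proof: Consider a given block code of length $n$ with the encoders $f_1 : \mathcal{U}^n \to \mathcal{X}_1^n$ and $f_2 : \mathcal{V}^n \to \mathcal{X}_2^n$ and decoder $g : \mathcal{Y}^n \to \mathcal{U}^n \times \mathcal{V}^n$. From Fano's inequality [9, p. 39], we have

$$H(U^n, V^n | Y^n) \le n \log_2 |\mathcal{U} \times \mathcal{V}| P_e + 1 \triangleq n \epsilon_n \tag{130}$$

Let $G_i$ be a permutation on the set $\{1, \dots, n\}$ (similarly on the sets $\{U_1, \dots, U_n\}$ and $\{V_1, \dots, V_n\}$). We define$^{10}$

$$\mathbf{U}_i \triangleq \{ G_i(U_k) : U_k \in \mathbf{U} \} \tag{131}$$

$$\mathbf{V}_i \triangleq \{ G_i(V_k) : V_k \in \mathbf{V} \} \tag{132}$$

This definition provides that $p(\mathbf{u}_i, \mathbf{v}_i)$, the joint probabilities of $\mathbf{U}_i$ and $\mathbf{V}_i$, are identical for $i = 1, \dots, n$.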

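For a code for which $P_e \to 0$ as $n \to \infty$, we have $\epsilon_n \to 0$. Then,

$$
\begin{aligned}
n H(U | V) &= H(U^n | V^n) \\
&= I(U^n; Y^n | V^n) + H(U^n | V^n, Y^n) \\
&\le I(U^n; Y^n | V^n) + H(U^n, V^n | Y^n) \\
&\overset{1}{\le} I(U^n; Y^n | V^n) + n \epsilon_n \\
&= H(Y^n | V^n) - H(Y^n | U^n, V^n) + n \epsilon_n \\
&\overset{2}{=} H(Y^n | X_2^n, V^n) - H(Y^n | X_1^n, X_2^n, U^n, V^n) + n \epsilon_n \\
&\overset{3}{=} H(Y^n | X_2^n, V^n) - H(Y^n | X_1^n, X_2^n) + n \epsilon_n \\
&\overset{4}{=} \sum_{i=1}^{n} \left[ H(Y_i | Y^{i-1}, X_2^n, V^n) - H(Y_i | X_{1i}, X_{2i}) \right] + n \epsilon_n \\
&\overset{5}{\le} \sum_{i=1}^{n} \left[ H(Y_i | X_{2i}, \mathbf{V}_i) - H(Y_i | X_{1i}, X_{2i}) \right] + n \epsilon_n \\
&\overset{6}{=} \sum_{i=1}^{n} \left[ H(Y_i | X_{2i}, \mathbf{V}_i) - H(Y_i | X_{1i}, X_{2i}, \mathbf{V}_i) \right] + n \epsilon_n \\
&= \sum_{i=1}^{n} I(X_{1i}; Y_i | X_{2i}, \mathbf{V}_i) + n \epsilon_n
\end{aligned}
\tag{133}
$$

where

1. from Fano's inequality in (130);

2. from the fact that $X_1^n$ is a deterministic function of $U^n$ and $X_2^n$ is a deterministic function of $V^n$;

3. from $p(y^n | x_1^n, x_2^n, u^n, v^n) = p(y^n | x_1^n, x_2^n)$;

4. from the chain rule and the memoryless nature of the channel;

5. from the property that conditioning reduces entropy;

6. from $p(y_i | x_{1i}, x_{2i}, \mathbf{v}_i) = p(y_i | x_{1i}, x_{2i})$.

Using a symmetrical argument, we obtain

$$n H(V | U) \le \sum_{i=1}^{n} I(X_{2i}; Y_i | X_{1i}, \mathbf{U}_i) + n \epsilon_n \tag{134}$$

$^{10}$ For example, if we let $\mathbf{U} = \{U_1, U_2\}$ and $\mathbf{V} = \{V_1, V_2\}$ and $G_i(1) = 3$ and $G_i(2) = 5$, then $\mathbf{U}_i = \{U_3, U_5\}$ and $\mathbf{V}_i = \{V_3, V_5\}$.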



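Moreover,

$$
\begin{aligned}
n H(U, V) &= H(U^n, V^n) \\
&= I(U^n, V^n; Y^n) + H(U^n, V^n | Y^n) \\
&\le I(U^n, V^n; Y^n) + n \epsilon_n \\
&\le I(X_1^n, X_2^n; Y^n) + n \epsilon_n \\
&= H(Y^n) - H(Y^n | X_1^n, X_2^n) + n \epsilon_n \\
&= \sum_{i=1}^{n} \left[ H(Y_i | Y^{i-1}) - H(Y_i | X_{1i}, X_{2i}) \right] + n \epsilon_n \\
&\le \sum_{i=1}^{n} \left[ H(Y_i) - H(Y_i | X_{1i}, X_{2i}) \right] + n \epsilon_n \\
&= \sum_{i=1}^{n} I(X_{1i}, X_{2i}; Y_i) + n \epsilon_n
\end{aligned}
\tag{135}
$$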



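We introduce a time-sharing random variable $Q$ [9, p. 397] as follows. Let $Q$ be uniformly distributed on $\{1, \dots, n\}$ and be independent of $U^n$, $V^n$. Let the random variables $X_1$ and $X_2$ be such that

$$p(x_{1i}, x_{2i} | \mathbf{u}_i, \mathbf{v}_i, \mathbf{u}_i^c, \mathbf{v}_i^c) = p(x_1, x_2 | \mathbf{u}, \mathbf{v}, \mathbf{u}^c, \mathbf{v}^c, Q = i) \tag{136}$$

where

$$\mathbf{U}^c \triangleq \{U_1, \dots, U_n\} \setminus \mathbf{U} \tag{137}$$

$$\mathbf{V}^c \triangleq \{V_1, \dots, V_n\} \setminus \mathbf{V} \tag{138}$$

$$\mathbf{U}_i^c \triangleq \{ G_i(U_k) : U_k \in \mathbf{U}^c \} \tag{139}$$

$$\mathbf{V}_i^c \triangleq \{ G_i(V_k) : V_k \in \mathbf{V}^c \} \tag{140}$$

Then,

$$\sum_{i=1}^{n} I(X_{1i}; Y_i | X_{2i}, \mathbf{V}_i) = n I(X_1; Y | X_2, \mathbf{V}, Q) \tag{141}$$

$$\sum_{i=1}^{n} I(X_{2i}; Y_i | X_{1i}, \mathbf{U}_i) = n I(X_2; Y | X_1, \mathbf{U}, Q) \tag{142}$$

$$\sum_{i=1}^{n} I(X_{1i}, X_{2i}; Y_i) = n I(X_1, X_2; Y | Q) \tag{143}$$

Combining (141), (142) and (143) with (133), (134) and (135) completes the proof. ■
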
4.3 A New Necessary Condition 

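It can be shown that the outer bound in Theorem 13 is equivalent to the following

$$\mathbf{H} \in \mathcal{R}(\mathcal{S}) \triangleq \mathrm{co} \left\{ \bigcup_{p \in \mathcal{S}_{X_1 X_2 | \mathbf{U}\mathbf{V}}} \mathcal{R}(p) \right\} \tag{144}$$

where

$$\mathbf{H} \triangleq [H(U | V), H(V | U), H(U, V)] \tag{145}$$

$$p \triangleq p(x_1, x_2 | \mathbf{u}, \mathbf{v}) \tag{146}$$

$$\mathcal{S}_{X_1 X_2 | \mathbf{U}\mathbf{V}} \triangleq \{ p : X_1 \to U^n \to V^n \to X_2, \; n \to \infty \} \tag{147}$$

$$
\mathcal{R}(p) \triangleq \left\{ [R_1, R_2, R_3] :
\begin{array}{l}
R_1 \le I(X_1; Y | X_2, \mathbf{V}) \\
R_2 \le I(X_2; Y | X_1, \mathbf{U}) \\
R_3 \le I(X_1, X_2; Y)
\end{array}
\right\}
\tag{148}
$$

and $\mathrm{co}\{\cdot\}$ represents the closure of the convex hull of the set argument.

From Section 2.5, we know that

$$\mathcal{S}_{X_1 X_2 | \mathbf{U}\mathbf{V}} \subseteq \mathcal{S}'_{X_1 X_2 | \mathbf{U}\mathbf{V}} \triangleq \bigcap_{\mathbf{U}' \subseteq \mathbf{U}, \, \mathbf{V}' \subseteq \mathbf{V}} \mathcal{S}_{\mathbf{U}'\mathbf{V}'} \tag{149}$$

where

$$\mathcal{S}_{\mathbf{U}'\mathbf{V}'} \triangleq \{ p(x_1, x_2 | \mathbf{u}, \mathbf{v}) : \lambda_i(\tilde{P}_{X_1 X_2 | \mathbf{u}'\mathbf{v}'}) \le \lambda_2(\tilde{P}_{UV}), \; i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \} \tag{150}$$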

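Then, we obtain a single-letter outer bound for the multiple access channel with correlated sources as follows.

Theorem 14 If a pair of i.i.d. sources $(U, V)$ with joint distribution $p(u, v)$ can be transmitted reliably through a discrete, memoryless, multiple access channel characterized by $p(y | x_1, x_2)$, then

$$H(U | V) \le I(X_1; Y | X_2, \mathbf{V}, Q) \tag{151}$$

$$H(V | U) \le I(X_2; Y | X_1, \mathbf{U}, Q) \tag{152}$$

$$H(U, V) \le I(X_1, X_2; Y | Q) \tag{153}$$

where $\mathbf{U} \subset \{U_1, \dots, U_n\}$ and $\mathbf{V} \subset \{V_1, \dots, V_n\}$ are two sets containing finitely many letters of the source samples, $Q$ is a random variable independent of $(\mathbf{U}, \mathbf{V})$, and the random variables $X_1$, $X_2$ have a conditional distribution $p(x_1, x_2 | \mathbf{u}, \mathbf{v}, q)$ such that, for any $\mathbf{U}' \subseteq \mathbf{U}$ and $\mathbf{V}' \subseteq \mathbf{V}$,

$$\lambda_i(\tilde{P}_{X_1 X_2 | \mathbf{u}'\mathbf{v}'q}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 1, \dots, \min(|\mathcal{X}_1|, |\mathcal{X}_2|) \tag{154}$$

Equivalently,

$$\mathbf{H} \in \mathcal{R}(\mathcal{S}') \triangleq \mathrm{co} \left\{ \bigcup_{p \in \mathcal{S}'_{X_1 X_2 | \mathbf{U}\mathbf{V}}} \mathcal{R}(p) \right\} \tag{155}$$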

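In the rest of this section, we will specialize our results to the case where we choose $\mathbf{U} = \{U_1\}$ and $\mathbf{V} = \{V_1\}$. Here, we have the following definitions$^{11}$

$$\mathcal{S}_{out,3} \triangleq \mathcal{S}_{X_1 X_2 | U_1 V_1} = \{ p(x_1, x_2 | u_1, v_1) : X_1 \to U^n \to V^n \to X_2 \} \tag{156}$$

and

$$\mathcal{S}_{out,4} \triangleq \mathcal{S}_{0} \cap \mathcal{S}_{U_1} \cap \mathcal{S}_{V_1} \cap \mathcal{S}_{U_1 V_1} \tag{157}$$

where

$$\mathcal{S}_{0} \triangleq \{ p(x_1, x_2 | u_1, v_1) : \lambda_i(\tilde{P}_{X_1 X_2}) \le \lambda_2(\tilde{P}_{UV}) \} \tag{158}$$

$$\mathcal{S}_{U_1} \triangleq \{ p(x_1, x_2 | u_1, v_1) : \lambda_i(\tilde{P}_{X_1 X_2 | u_1}) \le \lambda_2(\tilde{P}_{UV}) \} \tag{159}$$

$$\mathcal{S}_{V_1} \triangleq \{ p(x_1, x_2 | u_1, v_1) : \lambda_i(\tilde{P}_{X_1 X_2 | v_1}) \le \lambda_2(\tilde{P}_{UV}) \} \tag{160}$$

$$\mathcal{S}_{U_1 V_1} \triangleq \{ p(x_1, x_2 | u_1, v_1) : \lambda_i(\tilde{P}_{X_1 X_2 | u_1 v_1}) \le \lambda_2(\tilde{P}_{UV}) \} \tag{161}$$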

We note that when $\mathbf{U} = \{U_1\}$ and $\mathbf{V} = \{V_1\}$, the expressions in (148) agree with those in the achievability scheme of Cover, El Gamal and Salehi when there is no common information, i.e., (119), (120), and (121). Thus, the gap between the achievability scheme of Cover, El Gamal and Salehi, and the converse in this paper results from the fact that the feasible sets for the conditional probability distribution $p = p(x_1, x_2 | u, v)$ are different. In the achievability scheme of Cover, El Gamal and Salehi, $p$ belongs to

$$\mathcal{S}_{in} \triangleq \{ p(x_1, x_2 | u, v) : X_1 \to U \to V \to X_2 \} \tag{162}$$

since for the achievability, we need $X_1 \to U \to V \to X_2$. Whereas, in our converse, $p \in \mathcal{S}_{out,3} \subseteq \mathcal{S}_{out,4}$. Since $X_1 \to U \to V \to X_2$ implies $X_1 \to U^n \to V^n \to X_2$, and $X_1 \to U^n \to V^n \to X_2$ implies $\lambda_i(\tilde{P}_{X_1 X_2}) \le \lambda_2(\tilde{P}_{UV})$, $\lambda_i(\tilde{P}_{X_1 X_2 | u_1}) \le \lambda_2(\tilde{P}_{UV})$, $\lambda_i(\tilde{P}_{X_1 X_2 | v_1}) \le \lambda_2(\tilde{P}_{UV})$, and $\lambda_i(\tilde{P}_{X_1 X_2 | u_1 v_1}) \le \lambda_2(\tilde{P}_{UV})$, we have

$$\mathcal{S}_{in} \subseteq \mathcal{S}_{out,3} \subseteq \mathcal{S}_{out,4} \tag{163}$$

$^{11}$ The notation $\mathcal{S}_{out,3}$, as well as $\mathcal{S}_{out,4}$ and $\mathcal{S}_{in}$ in the sequel, is used in order to be consistent with the notations in Section 3.

Therefore, when m = 1, even though the mutual information expressions in the achiev- 
ability and the converse are the same, their actual values will be different, since they will 
be evaluated using the conditional probability distributions that belong to different feasible 
sets. 



5 Conclusion 

In the distributed coding of correlated sources, the problem of describing a joint distribution
involving an n-letter Markov chain arises. By means of spectrum analysis, we provided a 
new data processing inequality based on a new measure of correlation, which gave us a 
single-letter necessary condition for the n-letter Markov chain. We applied our results to 
two specific examples involving distributed coding of correlated sources: the multi-terminal 
rate-distortion region and the multiple access channel with correlated sources, and proposed 
two new outer bounds for these two problems. 



Appendices 

A An Illustrative Binary Example 

In this section, we will study a specific binary example in detail. The aims of this study are, first, to illustrate the single-letter necessary condition we proposed for the n-letter Markov chain in Section 2.3, second, to develop a sharper necessary condition in this specific case, and finally, to compare different necessary conditions and a sufficient condition in this specific example.

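The binary example under consideration is as follows. Let $U$, $V$, $X_1$ and $X_2$ be binary random variables, which take values from $\{0, 1\}$. We assume that $(U, V)$ are a pair of binary symmetric sources, i.e.,

$$\Pr(U = 0) = \Pr(U = 1) = \Pr(V = 0) = \Pr(V = 1) = \frac{1}{2} \tag{164}$$

From (12) and (15), we have

$$\tilde{P}_{UV} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} + \lambda_2(\tilde{P}_{UV}) \mu_2(\tilde{P}_{UV}) \nu_2(\tilde{P}_{UV})^T \tag{165}$$

Here we focus on the symmetric case, i.e.,

$$\mu_2(\tilde{P}_{UV}) = \nu_2(\tilde{P}_{UV}) = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} \tag{166}$$

In addition, we assume the following marginal distributions for $X_1$ and $X_2$,

$$P_{X_1} = \begin{bmatrix} a^2 \\ 1 - a^2 \end{bmatrix} \tag{167}$$

$$P_{X_2} = \begin{bmatrix} b^2 \\ 1 - b^2 \end{bmatrix} \tag{168}$$

where $0 < a, b < 1$. Then, from (12) and (15), we have

$$\tilde{P}_{X_1 X_2} = \begin{bmatrix} a \\ \sqrt{1 - a^2} \end{bmatrix} \begin{bmatrix} b & \sqrt{1 - b^2} \end{bmatrix} + \lambda_2(\tilde{P}_{X_1 X_2}) \mu_2(\tilde{P}_{X_1 X_2}) \nu_2(\tilde{P}_{X_1 X_2})^T \tag{169}$$

We note that

$$\mu_2(\tilde{P}_{X_1 X_2}) \nu_2(\tilde{P}_{X_1 X_2})^T = \sigma \begin{bmatrix} \sqrt{1 - a^2} \\ -a \end{bmatrix} \begin{bmatrix} \sqrt{1 - b^2} & -b \end{bmatrix} \tag{170}$$

where $\sigma \in \{1, -1\}$. For the simplicity of the derivation in the sequel, we let $\lambda \triangleq \sigma \lambda_2(\tilde{P}_{X_1 X_2})$. Then, we have

$$\tilde{P}_{X_1 X_2} = \begin{bmatrix} a \\ \sqrt{1 - a^2} \end{bmatrix} \begin{bmatrix} b & \sqrt{1 - b^2} \end{bmatrix} + \lambda \begin{bmatrix} \sqrt{1 - a^2} \\ -a \end{bmatrix} \begin{bmatrix} \sqrt{1 - b^2} & -b \end{bmatrix} \tag{171}$$

From Theorem 1, we know that the entries of $\tilde{P}_{X_1 X_2}$ are non-negative, i.e.,

$$\tilde{P}_{X_1 X_2} = \begin{bmatrix} ab + \lambda \sqrt{(1 - a^2)(1 - b^2)} & a \sqrt{1 - b^2} - \lambda b \sqrt{1 - a^2} \\ b \sqrt{1 - a^2} - \lambda a \sqrt{1 - b^2} & \sqrt{(1 - a^2)(1 - b^2)} + \lambda ab \end{bmatrix} \ge 0 \tag{172}$$

which implies that

$$-\xi_2 \le \lambda \le \xi_1 \tag{173}$$

where

$$\xi_1 \triangleq \frac{\min(a^2, b^2) \min(1 - a^2, 1 - b^2)}{ab \sqrt{(1 - a^2)(1 - b^2)}} \le 1 \tag{174}$$

$$\xi_2 \triangleq \frac{\min(1 - a^2, b^2) \min(a^2, 1 - b^2)}{ab \sqrt{(1 - a^2)(1 - b^2)}} \le 1 \tag{175}$$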
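The parametrization above can be checked numerically. The following is a minimal sketch, assuming the form $\tilde{P}_{X_1X_2} = [a, \sqrt{1-a^2}]^T [b, \sqrt{1-b^2}] + \lambda [\sqrt{1-a^2}, -a]^T [\sqrt{1-b^2}, -b]$ and marginals $(a^2, 1-a^2)$ and $(b^2, 1-b^2)$; the function names `xi1`, `xi2` and `joint_from_lambda` and the sample values of $a$, $b$ are illustrative and not from the paper. It rebuilds the joint distribution of $(X_1, X_2)$ from $(a, b, \lambda)$ and confirms that the marginals do not depend on $\lambda$, while non-negativity holds only for $-\xi_2 \le \lambda \le \xi_1$.

```python
import math


def xi1(a, b):
    """Upper limit on lambda implied by non-negativity of the off-diagonal entries."""
    root = a * b * math.sqrt((1 - a * a) * (1 - b * b))
    return min(a * a, b * b) * min(1 - a * a, 1 - b * b) / root


def xi2(a, b):
    """Magnitude of the lower limit implied by non-negativity of the diagonal entries."""
    root = a * b * math.sqrt((1 - a * a) * (1 - b * b))
    return min(1 - a * a, b * b) * min(a * a, 1 - b * b) / root


def joint_from_lambda(a, b, lam):
    """Joint pmf of (X1, X2) obtained by undoing the normalization of P~."""
    sa, sb = math.sqrt(1 - a * a), math.sqrt(1 - b * b)
    p_tilde = [
        [a * b + lam * sa * sb, a * sb - lam * b * sa],
        [b * sa - lam * a * sb, sa * sb + lam * a * b],
    ]
    rx, ry = (a, sa), (b, sb)  # square roots of the two marginals
    return [[rx[i] * p_tilde[i][j] * ry[j] for j in range(2)] for i in range(2)]


a, b = 0.6, 0.8  # assumed sample values
joint = joint_from_lambda(a, b, 0.3)
print([sum(row) for row in joint])                    # row marginal, close to [a^2, 1 - a^2]
print(min(min(row) for row in joint) >= 0)            # True: 0.3 lies inside [-xi2, xi1]
print(min(min(r) for r in joint_from_lambda(a, b, xi1(a, b) + 0.01)) < 0)  # True
```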

From Theorem HJ we have 

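$$-\lambda_2(\tilde{P}_{UV}) \le \lambda \le \lambda_2(\tilde{P}_{UV}) \tag{176}$$

Thus, from above, we have

$$-\min(\xi_2, \lambda_2(\tilde{P}_{UV})) \le \lambda \le \min(\xi_1, \lambda_2(\tilde{P}_{UV})) \tag{177}$$

A sharper bound in this special case can be obtained as follows.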



Theorem 15 If $X_1 \to U^n \to V^n \to X_2$, and $(X_1, X_2, U^n, V^n)$ satisfies the above settings, then for sufficiently large $n$,

$$-\min\left(\xi_2, \lambda_2(\tilde{P}_{UV}) \frac{1 + \xi_2}{2}\right) \le \lambda \le \min\left(\xi_1, \lambda_2(\tilde{P}_{UV}) \frac{1 + \xi_1}{2}\right) \tag{178}$$

The proof of Theorem 15 is given in Appendix B.2.

The bound in (178) is tighter than the one in (177) because $\xi_1 \le 1$ and therefore $\frac{1 + \xi_1}{2} \le 1$. A similar argument holds for the other side of the inequality as well.

In the above derivation, we provided two necessary conditions for the n-letter Markov chain $X_1 \to U^n \to V^n \to X_2$, where $n \to \infty$, in this special case of binary random variables. In other words, we provided two outer bounds for $\lambda$, where the joint distributions $p(x_1, x_2, u^n, v^n)$ satisfy the n-letter Markov chain $X_1 \to U^n \to V^n \to X_2$ with $n \to \infty$ and satisfy the fixed marginal distributions given in (167) and (168).

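For reference, we give a sufficient condition for $X_1 \to U^n \to V^n \to X_2$, or equivalently, an inner bound for $\lambda$ satisfying this n-letter Markov chain. This inner bound is obtained by noting that if $(X_1, X_2)$ satisfies $X_1 \to U \to V \to X_2$, then it satisfies $X_1 \to U^n \to V^n \to X_2$. In this case, using Theorem 1, we have

$$\lambda = \lambda_L \lambda_2(\tilde{P}_{UV}) \lambda_R \tag{179}$$

where $\lambda_L$ and $\lambda_R$ are such that

$$\tilde{P}_{X_1 U} = \begin{bmatrix} a \\ \sqrt{1 - a^2} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} + \lambda_L \begin{bmatrix} \sqrt{1 - a^2} \\ -a \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \ge 0 \tag{180}$$

$$\tilde{P}_{V X_2} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} b & \sqrt{1 - b^2} \end{bmatrix} + \lambda_R \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \sqrt{1 - b^2} & -b \end{bmatrix} \ge 0 \tag{181}$$

Due to the non-negativity of the matrices $\tilde{P}_{X_1 U}$ and $\tilde{P}_{V X_2}$, we have

$$-\frac{\min(a^2, 1 - a^2)}{a \sqrt{1 - a^2}} \le \lambda_L \le \frac{\min(a^2, 1 - a^2)}{a \sqrt{1 - a^2}} \tag{182}$$

$$-\frac{\min(b^2, 1 - b^2)}{b \sqrt{1 - b^2}} \le \lambda_R \le \frac{\min(b^2, 1 - b^2)}{b \sqrt{1 - b^2}} \tag{183}$$

Thus, we have

$$-\lambda_2(\tilde{P}_{UV}) \xi_3 \le \lambda \le \lambda_2(\tilde{P}_{UV}) \xi_3 \tag{184}$$

where

$$\xi_3 \triangleq \frac{\min(a^2, 1 - a^2) \min(b^2, 1 - b^2)}{ab \sqrt{(1 - a^2)(1 - b^2)}} \tag{185}$$

Then, combining (177), (178), and (184), we have the two outer bounds and one inner bound for $\lambda$ as follows

$$\lambda_2(\tilde{P}_{UV}) \xi_3 \le \sup \lambda \le \min\left(\xi_1, \lambda_2(\tilde{P}_{UV}) \frac{1 + \xi_1}{2}\right) \le \min(\xi_1, \lambda_2(\tilde{P}_{UV})) \tag{186}$$

$$-\min(\xi_2, \lambda_2(\tilde{P}_{UV})) \le -\min\left(\xi_2, \lambda_2(\tilde{P}_{UV}) \frac{1 + \xi_2}{2}\right) \le \inf \lambda \le -\lambda_2(\tilde{P}_{UV}) \xi_3 \tag{187}$$

We illustrate these three bounds with $\lambda_2(\tilde{P}_{UV}) = 0.5$ in Figure 2.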
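As a rough numerical companion to the comparison of the three bounds, the sketch below evaluates the two outer intervals and the inner interval for $\lambda$. It assumes the expressions $\xi_1 = \min(a^2, b^2)\min(1-a^2, 1-b^2)/D$, $\xi_2 = \min(1-a^2, b^2)\min(a^2, 1-b^2)/D$ and $\xi_3 = \min(a^2, 1-a^2)\min(b^2, 1-b^2)/D$ with $D = ab\sqrt{(1-a^2)(1-b^2)}$; the function names and the sample point $(a, b) = (0.6, 0.8)$ are illustrative and not taken from the paper.

```python
import math


def xis(a, b):
    """Return (xi1, xi2, xi3) for marginals (a^2, 1-a^2) and (b^2, 1-b^2)."""
    a2, b2 = a * a, b * b
    root = a * b * math.sqrt((1 - a2) * (1 - b2))
    x1 = min(a2, b2) * min(1 - a2, 1 - b2) / root
    x2 = min(1 - a2, b2) * min(a2, 1 - b2) / root
    x3 = min(a2, 1 - a2) * min(b2, 1 - b2) / root
    return x1, x2, x3


def lambda_bounds(a, b, lam_uv):
    """Intervals (low, high) for lambda: two outer bounds and one inner bound."""
    x1, x2, x3 = xis(a, b)
    return {
        # first outer bound: non-negativity combined with lambda_2 of the source
        "outer1": (-min(x2, lam_uv), min(x1, lam_uv)),
        # second outer bound: the sharper form with the factor (1 + xi) / 2
        "outer2": (-min(x2, lam_uv * (1 + x2) / 2), min(x1, lam_uv * (1 + x1) / 2)),
        # inner bound from the single-letter Markov chain
        "inner": (-lam_uv * x3, lam_uv * x3),
    }


for name, interval in lambda_bounds(0.6, 0.8, 0.5).items():
    print(name, interval)
```

For this sample point the upper limits come out near 0.5, 0.39 and 0.28 for the first outer bound, the second outer bound and the inner bound respectively, so the intervals are nested as in the chain of inequalities above.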



B Proofs of Some Theorems 
B.1 Proof of Theorem 3

To find $\sup_{F(n, P_{X_1}),\, n = 1, 2, \dots} \lambda_2(\tilde{P}_{X_1 U^n})$, we need to exhaust the sets $F(n, P_{X_1})$ with $n \ge 1$. In the following, we show that it suffices to check only the asymptotic case.

For any joint distribution $P_{X_1 U^n} \in F(n, P_{X_1})$, we attach an independent $U$, say $U_{n+1}$, to the existing n-sequence, and get a new joint distribution $P_{X_1 U^{n+1}} = P_{X_1 U^n} \otimes p_U$, where $p_U$ is the marginal distribution of $U$ in the vector form. By arguments similar to those in Section 2.4, we have that $\lambda_i(\tilde{P}_{X_1 U^{n+1}}) = \lambda_i(\tilde{P}_{X_1 U^n})$. Therefore, for every $P_{X_1 U^n} \in F(n, P_{X_1})$, there exists some $P_{X_1 U^{n+1}} \in F(n + 1, P_{X_1})$, such that $\lambda_i(\tilde{P}_{X_1 U^{n+1}}) = \lambda_i(\tilde{P}_{X_1 U^n})$. Thus,

$$\sup_{F(n, P_{X_1})} \lambda_2(\tilde{P}_{X_1 U^n}) \le \sup_{F(n+1, P_{X_1})} \lambda_2(\tilde{P}_{X_1 U^{n+1}}) \tag{188}$$

Figure 2: (i) Outer bound 1, (ii) outer bound 2, and (iii) inner bound for $\lambda$.

From (188), we see that $\sup_{F(n, P_{X_1})} \lambda_2(\tilde{P}_{X_1 U^n})$ is monotonically non-decreasing in $n$. We also note that $\lambda_2(\tilde{P}_{X_1 U^n})$ is upper bounded by 1 for all $n$, i.e., $\lambda_2(\tilde{P}_{X_1 U^n}) \le 1$. Therefore,

$$\sup_{F(n, P_{X_1}),\, n = 1, 2, \dots} \lambda_2(\tilde{P}_{X_1 U^n}) = \lim_{n \to \infty} \sup_{F(n, P_{X_1})} \lambda_2(\tilde{P}_{X_1 U^n}) \tag{189}$$



To complete the proof, we need the following lemma. 



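Lemma 3 [10] $\lambda_2(\tilde{P}_{XY}) = 1$ if and only if $P_{XY}$ decomposes. By $P_{XY}$ decomposes, we mean that there exist sets $S_1 \subset \mathcal{X}$, $S_2 \subset \mathcal{Y}$, such that $P(S_1)$, $P(\mathcal{X} - S_1)$, $P(S_2)$, $P(\mathcal{Y} - S_2)$ are positive, while $P((\mathcal{X} - S_1) \times S_2) = P(S_1 \times (\mathcal{Y} - S_2)) = 0$.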

In the following, we will show by construction that there exists a joint distribution that 
decomposes asymptotically. 

For a given marginal distribution $P_{X_1}$, we arbitrarily choose a subset $S_1$ from the alphabet of $X_1$ with positive $P(S_1)$. We find a set $S_2$ in the alphabet of $U^n$ such that $P(S_1) = P(S_2)$ if it is possible. Otherwise, we pick $S_2$ with positive $P(S_2)$ such that $|P(S_1) - P(S_2)|$ is minimized. We denote $\mathcal{C}(n)$ to be the set of all subsets of the alphabet of $U^n$ and we also define $P_{max} \triangleq \max \Pr(s)$ for all $s \in \mathcal{U}$. Then, we have

$$\min_{S_2 \in \mathcal{C}(n)} |P(S_2) - P(S_1)| \le P_{max}^n \tag{190}$$
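One way to see why a gap of at most $P_{max}^n$ is always attainable is to add sequences to $S_2$ one at a time: each step raises $P(S_2)$ by at most $P_{max}^n$, so the running total cannot jump over a target in $(0, 1]$ by more than that amount. The sketch below illustrates this for a small hypothetical i.i.d. source; the greedy nested family is an assumption made for illustration (it is only a subfamily of $\mathcal{C}(n)$, so the true minimum is no larger), and the function name `closest_prefix_gap` is not from the paper.

```python
from itertools import product


def closest_prefix_gap(p_u, n, target):
    """Smallest |P(S2) - target| over nested sets grown one sequence at a time."""
    probs = []
    for seq in product(range(len(p_u)), repeat=n):
        pr = 1.0
        for symbol in seq:
            pr *= p_u[symbol]  # i.i.d. source: sequence probability is a product
        probs.append(pr)
    best, running = float("inf"), 0.0
    for pr in probs:
        running += pr  # P(S2) after adding one more sequence to S2
        best = min(best, abs(running - target))
    return best


p_u = [0.7, 0.3]  # assumed single-letter distribution of U
n = 4
p_max_n = max(p_u) ** n
for target in (0.1, 0.35, 0.5, 0.9):
    print(target, closest_prefix_gap(p_u, n, target) <= p_max_n)
```

Each printed flag is expected to be `True`, since consecutive running totals differ by at most $P_{max}^n$.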



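We construct a joint distribution for $X_1$ and $U^n$ as follows. First, we construct the joint distribution $P^{i}$ corresponding to the case where $X_1$ and $U^n$ are independent. Second, we rearrange the alphabets of $X_1$ and $U^n$ and group the sets $S_1$, $\mathcal{X}_1 - S_1$, $S_2$ and $\mathcal{U}^n - S_2$ as follows

$$P^{i} = \begin{bmatrix} P^{i}_{11} & P^{i}_{12} \\ P^{i}_{21} & P^{i}_{22} \end{bmatrix} \tag{191}$$

where $P^{i}_{11}$, $P^{i}_{12}$, $P^{i}_{21}$, $P^{i}_{22}$ correspond to the sets $S_1 \times S_2$, $S_1 \times (\mathcal{U}^n - S_2)$, $(\mathcal{X}_1 - S_1) \times S_2$, $(\mathcal{X}_1 - S_1) \times (\mathcal{U}^n - S_2)$, respectively. Here, we assume that $P(S_2) \ge P(S_1)$. Then, we scale these four sub-matrices as $P_{11} = \frac{P^{i}_{11}}{P(S_2)}$, $P_{12} = 0$, $P_{21} = \frac{P^{i}_{21} (P(S_2) - P(S_1))}{P(S_2)(1 - P(S_1))}$, $P_{22} = \frac{P^{i}_{22}}{1 - P(S_1)}$, and let

$$P = \begin{bmatrix} P_{11} & 0 \\ P_{21} & P_{22} \end{bmatrix} \tag{192}$$

We note that $P$ is a joint distribution for $X_1$ and $U^n$ with the given marginal distributions.

Next, we move the mass in the sub-matrix $P_{21}$ to $P_{11}$, which yields

$$P' = P + E = \begin{bmatrix} P_{11} & 0 \\ P_{21} & P_{22} \end{bmatrix} + \begin{bmatrix} E_{11} & 0 \\ -E_{21} & 0 \end{bmatrix} = \begin{bmatrix} P'_{11} & 0 \\ 0 & P_{22} \end{bmatrix} \tag{193}$$

where $E_{21} = P_{21}$, $E_{11} = \frac{P^{i}_{11} (P(S_2) - P(S_1))}{P(S_1) P(S_2)}$, and $P'_{11} = \frac{P^{i}_{11}}{P(S_1)}$. We denote $P'_{X_1}$ and $P'_{U^n}$ as the marginal distributions of $P'$. We note that $P'_{U^n} = P_{U^n}$ and $P'_{X_1} = P_{X_1} M$, where $M$ is a scaling diagonal matrix. The elements in the set $S_1$ are scaled up by a factor of $\frac{P(S_2)}{P(S_1)}$, and those in the set $\mathcal{X}_1 - S_1$ are scaled down by a factor of $\frac{1 - P(S_2)}{1 - P(S_1)}$. Then,

$$\tilde{P}' = M^{-\frac{1}{2}} \tilde{P} + M^{-\frac{1}{2}} P_{X_1}^{-\frac{1}{2}} E P_{U^n}^{-\frac{1}{2}} \tag{194}$$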



We will need the following lemmas in the remainder of our derivations. Lemma 5 can be proved using techniques similar to those in the proof of Lemma 4 [21].

Lemma 4 [21] If $A' = A + E$, then $|\lambda_i(A') - \lambda_i(A)| \le \|E\|_2$, where $\|E\|_2$ is the spectral norm of $E$.

Lemma 5 If $A' = MA$, where $M$ is an invertible matrix, then $\|M^{-1}\|_2^{-1} \le \lambda_i(A') / \lambda_i(A) \le \|M\|_2$.

Since P' decomposes, using Lemma [31 we conclude that A2(P') = 1. We upper bound 

_ 1 _ 1 

I \Px'^ EP^n 1 12 as follows. 



I I PX^ EP^^ I I 2 < II Px^ EPjjn 1 1 F 

where || ■ ||ir is the Frobenius norm. Combining fll9ip and (11931) . we have 



\p-hFp~h\ ^ {P{S2)-P{Si)) . 



P[PiS2 



TD 1 r> 211 



(195) 



(196) 



where P[ = min(P(5'i), 1 — P(S'i)). Since P* corresponds to the independent case, we have 
||P^^^P*Pf;i||p = 1 from ([IS]). Then, from f[T90|) . f[T95D and f[T96|l . we obtain 

||P^^^EP^i||2<CiP^,,. 

where ci ^ ^r^y 

From Lemma [21 we have 



\M~-2Px;'EPu,f 



I2 = \\iiM-'^Px:'EP^})\ < { \_p[l'^ ) ' CiP-^. = C2P: 



(197) 



:i98) 
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From Lemma m we have 

l-C2P^ax<A2(M-^p)<l+c^pn^ 

We upper bound ||M^||2 as follows 



(199) 



||Mt||,=,/^<l + ,/^(M^<l+ P- 



p(Si) - - ' v p(Si) - - ■ ,/p(sr) 

Similarly, HM^^H^ ^ > 1 — C4Pmax. From Lemma [5l we have 

(1 - c.P-il) < < (1 + csP-il] 



n/2 



(200) 



(201) 



Since P is a joint distribution matrix, from Theorem [T], we know that A2(P) < 1. Therefore, 
we have 



;i-C4P^/x)(l-C2P,:ax)<A2(P)<l 



(202) 



When $P_{max} < 1$, corresponding to the non-trivial case, $\lim_{n \to \infty} P_{max}^n = 0$, and using (189), (32) follows.

The case $P(S_2) < P(S_1)$ can be proved similarly. ■



B.2 Proof of Theorem 15

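From (165), we know

$$\tilde{P}_{UV} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} + \lambda_2(\tilde{P}_{UV}) \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \tag{203}$$

From (29), we know

$$\tilde{P}_{U^n V^n} = \frac{1}{2^n} \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} + \sum_{i=2}^{2^n} \lambda_2(\tilde{P}_{UV})^{k_i} \mu_i(\tilde{P}_{U^n V^n}) \nu_i(\tilde{P}_{U^n V^n})^T \tag{204}$$

where $k_i \in \{1, 2, \dots, n\}$, for $i = 2, \dots, 2^n$. Due to the symmetric structure of $\tilde{P}_{U^n V^n}$, we have

$$\mu_i(\tilde{P}_{U^n V^n}) = \nu_i(\tilde{P}_{U^n V^n}), \quad i = 2, \dots, 2^n \tag{205}$$

We also have

$$\tilde{P}_{X_1 U^n} = \frac{1}{2^{n/2}} \begin{bmatrix} a \\ \sqrt{1 - a^2} \end{bmatrix} \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} + \begin{bmatrix} \sqrt{1 - a^2} \\ -a \end{bmatrix} c^T \tag{206}$$

where $c$ is the product of the second singular value and the second right singular vector of $\tilde{P}_{X_1 U^n}$. Similarly,

$$\tilde{P}_{V^n X_2} = \frac{1}{2^{n/2}} \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \begin{bmatrix} b & \sqrt{1 - b^2} \end{bmatrix} + d \begin{bmatrix} \sqrt{1 - b^2} & -b \end{bmatrix} \tag{207}$$

From the Markov chain $X_1 \to U^n \to V^n \to X_2$, we know that

$$
\begin{aligned}
\tilde{P}_{X_1 X_2} &= \tilde{P}_{X_1 U^n} \tilde{P}_{U^n V^n} \tilde{P}_{V^n X_2} \\
&= \begin{bmatrix} a \\ \sqrt{1 - a^2} \end{bmatrix} \begin{bmatrix} b & \sqrt{1 - b^2} \end{bmatrix} + \begin{bmatrix} \sqrt{1 - a^2} \\ -a \end{bmatrix} c^T \left( \sum_{i=2}^{2^n} \lambda_2(\tilde{P}_{UV})^{k_i} \mu_i(\tilde{P}_{U^n V^n}) \nu_i(\tilde{P}_{U^n V^n})^T \right) d \begin{bmatrix} \sqrt{1 - b^2} & -b \end{bmatrix}
\end{aligned}
\tag{208}
$$

Thus, we conclude that

$$\lambda = c^T \left( \sum_{i=2}^{2^n} \lambda_2(\tilde{P}_{UV})^{k_i} \mu_i(\tilde{P}_{U^n V^n}) \nu_i(\tilde{P}_{U^n V^n})^T \right) d \tag{209}$$

Consider the following optimization problem,

$$\max_{c, d} \lambda = \max_{c, d} \; c^T \left( \sum_{i=2}^{2^n} \lambda_2(\tilde{P}_{UV})^{k_i} \mu_i(\tilde{P}_{U^n V^n}) \nu_i(\tilde{P}_{U^n V^n})^T \right) d \tag{210}$$

We define

$$\gamma_i \triangleq c^T \mu_i(\tilde{P}_{U^n V^n}), \quad i = 2, \dots, 2^n \tag{211}$$

$$\delta_i \triangleq \nu_i(\tilde{P}_{U^n V^n})^T d, \quad i = 2, \dots, 2^n \tag{212}$$

Then,

$$\lambda = \sum_{i=2}^{2^n} \lambda_2(\tilde{P}_{UV})^{k_i} \gamma_i \delta_i \tag{213}$$

We partition the set $\{2, \dots, 2^n\}$ into two disjoint subsets, $\mathcal{L}^+$ and $\mathcal{L}^-$, such that

$$i \in \begin{cases} \mathcal{L}^+ & \text{if } \gamma_i \delta_i \ge 0 \\ \mathcal{L}^- & \text{if } \gamma_i \delta_i < 0 \end{cases} \tag{214}$$

Hence,

$$
\begin{aligned}
\lambda &= \sum_{i \in \mathcal{L}^+} \lambda_2(\tilde{P}_{UV})^{k_i} \gamma_i \delta_i + \sum_{i \in \mathcal{L}^-} \lambda_2(\tilde{P}_{UV})^{k_i} \gamma_i \delta_i \\
&\overset{1}{\le} \lambda_2(\tilde{P}_{UV}) \sum_{i \in \mathcal{L}^+} \gamma_i \delta_i \\
&\overset{2}{\le} \frac{\lambda_2(\tilde{P}_{UV})}{4} \sum_{i \in \mathcal{L}^+} (\gamma_i + \delta_i)^2 \\
&\overset{3}{\le} \frac{\lambda_2(\tilde{P}_{UV})}{4} \sum_{i=2}^{2^n} (\gamma_i + \delta_i)^2 \\
&\overset{4}{=} \frac{\lambda_2(\tilde{P}_{UV})}{4} (c + d)^T (c + d) \\
&\overset{5}{\le} \frac{\lambda_2(\tilde{P}_{UV})}{2} (1 + c^T d)
\end{aligned}
\tag{215}
$$

where

1. because of the definition of $\mathcal{L}^+$ and $\mathcal{L}^-$ in (214) and $0 \le \lambda_2(\tilde{P}_{UV}) \le 1$;

2. because for non-negative $\gamma_i \delta_i$,

$$(\gamma_i - \delta_i)^2 = \gamma_i^2 + \delta_i^2 - 2 \gamma_i \delta_i \ge 0 \tag{216}$$

Hence, by adding $4 \gamma_i \delta_i$ to both sides of the above inequality, we have

$$(\gamma_i + \delta_i)^2 \ge 4 \gamma_i \delta_i \tag{217}$$

3. due to the fact that $(\gamma_i + \delta_i)^2$ is non-negative for $i \in \mathcal{L}^-$;

4. comes from the following derivation

$$
\begin{aligned}
\sum_{i=2}^{2^n} (\gamma_i + \delta_i)^2 &= \sum_{i=2}^{2^n} \left( c^T \mu_i(\tilde{P}_{U^n V^n}) + d^T \mu_i(\tilde{P}_{U^n V^n}) \right)^2 \\
&= \sum_{i=2}^{2^n} \left( (c + d)^T \mu_i(\tilde{P}_{U^n V^n}) \right)^2 \\
&\overset{(a)}{=} \sum_{i=1}^{2^n} \left( (c + d)^T \mu_i(\tilde{P}_{U^n V^n}) \right)^2 \\
&= (c + d)^T M M^T (c + d) \\
&\overset{(b)}{=} (c + d)^T (c + d)
\end{aligned}
\tag{218}
$$

where $M$ denotes the matrix whose columns are the singular vectors $\mu_1(\tilde{P}_{U^n V^n}), \dots, \mu_{2^n}(\tilde{P}_{U^n V^n})$, and

(a) because both the vectors $c$ and $d$ are within the subspace spanned by the singular vectors $[\mu_2(\tilde{P}_{U^n V^n}), \dots, \mu_{2^n}(\tilde{P}_{U^n V^n})]$, thus

$$(c + d)^T \mu_1(\tilde{P}_{U^n V^n}) = 0 \tag{219}$$

(b) because

$$M M^T = I \tag{220}$$

5. because $c^T c = \lambda_2(\tilde{P}_{X_1 U^n})^2$ and $d^T d = \lambda_2(\tilde{P}_{V^n X_2})^2$, and from Theorem 1 we know that the square of $\lambda_2$ is less than or equal to 1.

From the above discussion, we conclude that

$$\max_{c, d} \lambda \le \max_{c, d} \frac{\lambda_2(\tilde{P}_{UV})}{2} (1 + c^T d) \tag{221}$$

Thus, we can upper bound $\lambda$ by $\max_{c, d} \frac{\lambda_2(\tilde{P}_{UV})}{2} (1 + c^T d)$.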

From (fT2il . we know that PxiU^ is a non-negative matrix, i.e.. 



P 



^Vl - a'^e^ - ac^ 



> (222) 



where e is defined as a vector where all its elements are equal to 1, and for matrix A and 
-B, by A > B, we mean all the entries of the matrix A — B are non-negative. This property 
implies that 



1 1 - - A 1 



, =e > c = ,^ , -.e + c > (223) 
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Figure 3: Subset of simplex satisfying fl223p . 



We know that c is orthogonal to e, i.e., 

$^Q = (224) 



T 

c e 



i=l 

Hence, we see that the vector c is on the hyperplane that contains the point i^^j^^^ and 
is orthogonal to the vector e. On the other hand, fl223p shows that each coordinate of c is 
non- negative and less than or equal to ^;r72 ^^/j^ " '^^^S' vector c lies on a subset of 
simplex. See Figure [3] for a three-dimension illustration. 
By a symmetric argument, we have 

e>d= ^,. ^ e + d>0 (225) 



2"/2 5^/1352 - 2"/2 71^52 
Since c = ^^^e + c and d = ^^^e + d, 
1 a \^ / 1 h 



c d= — J- , e + c — J- , e + d 

+ _ , :e^d + , e^c + c^d 



v/(l-a2)(l-62) 2"/2 2"/2 v/r352 

+ c^d (226) 



Then, 



max c^d = max c^d , (227) 

c,d c,a v^(l-a2)(l-62) 
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The feasible sets of c and d are defined as follows, 

C^ix:^ — e > X > and e^x = 2"/^ , " I (228) 
T>^[^- ^ ^ ^ ^ P > V > n and e^x = 2"/2^^\ (229) 

Consider the following optimization problem 

max c^d (230) 

ceC,d6X> 

In the following, we will show that there exist C C C and V CV such that 

max c^d = max C"^d (231) 

If we assume that 

max c^d = max c^d Vd G P (232) 
cec cec 

max c^d = max C"^d Vc G C (233) 
dec dec 

and we also assume that the set C (V' respectively) does not depend on the value of d (c), 
then we have 

max C"^d = max max c^d 
cec.de© cec def 

= max max c d 
cec dec 

= max max c d 

dev cec 

= max max c d 
de©' cec 

= max c^d (234) 

ceC',deX" 

where 

1. because of (12331) : 

2. because we assume that the set V does not depend on the value of c; 

3. because of 



Now we need to show that our assumptions, (232) and (233), are valid, for which we need the following lemma.

Lemma 6 [22, p. 722] Let $C$ be a convex subset of $\mathbb{R}^n$, and let $C^*$ be the set of minima of a concave function $f : C \to \mathbb{R}$ over $C$. If $C$ is closed and contains at least one extreme point, and $C^*$ is nonempty, then $C^*$ contains some extreme point of $C$.

Here the extreme point is defined as follows:

Definition 2 [22, p. 721] A vector $x$ is said to be an extreme point of a convex set $C$ if $x$ belongs to $C$ and there do not exist vectors $y \in C$ and $z \in C$, with $y \ne x$ and $z \ne x$, and a scalar $\alpha \in (0, 1)$ such that $x = \alpha y + (1 - \alpha) z$. An equivalent definition is that $x$ cannot be expressed as a convex combination of some vectors of $C$, all of which are different from $x$.

Thus, if we assume

$$\mathcal{C}' = \{ \text{extreme points of } \mathcal{C} \} \tag{235}$$

$$\mathcal{D}' = \{ \text{extreme points of } \mathcal{D} \} \tag{236}$$

(232) and (233) will be satisfied. We observe that the set $\mathcal{C}'$ (respectively, the set $\mathcal{D}'$), which consists of all the extreme points in the set $\mathcal{C}$ (in the set $\mathcal{D}$), does not depend on the value of $d$ ($c$).

Next, we determine the extreme point set C in the following lemma. 

Lemma 7 The setC consists of all the vectors, each of which contains 2"a^ non-zero entries 
with value ^VT^ ' '"^^^'^ ^'^ sufficiently large. 

Proof: We define the set C" as the set where each element contains 2"a^ non-zero entries 
equal to ^^^72 ^^/jz^ - K is easy to see that every vector in C" is within the set C. We need to 
show that any vector in the set C is a convex combination of some vectors in C". This can 
be proven by induction. It is easy to see that, if a vector such that 2" — 1 out of 2" entries 
take values from {0, ^^772 ^^7^^ }> the last entry will converge to 0, when n goes to infinity. 
Let s G C such that / out of 2" entries take values in (0, ^w^ av^T^ -^" '^^en, we choose any 2 
out of these / entries, which are equal to a and /5, respectively. If a + /3 < ^7772 ^^/j^^ ; then 

... Q/ ... p ... 

= ^-[--- ■■■ a + P ■■]+^^\--- c^ + P ■■■ ■■■1 (237) 
a + p L } a + p L J 
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lia + /3> 



1 1 

2"/2 aVT^ 



, then 



a 



2"/2 aVl^ 

1 1 



+ 



-/3 

a — /5 
— a 



— a — (3 



2"/2 av^^ 

■ a + /3 - 



a + /3 - 



1 1 



2"/2 aVT^ 



1 1 
2"/2 aVl^ 



1 



1 



2"/2 aVT^ 



(238) 



which means that s can be expressed as a convex combination of two vectors. These two vec- 
tors belong to set C and both of them have l — l out of 2" entries takes value in (0, 



2"/2 aVT^'' 

By induction, we can show that every vector in set C can be expressed as a convex combi- 
nation of some vectors in C" . On the other hand, it is easy to see that any vector s in C" 
cannot be expressed as a convex combination of some vectors in the set C other than s itself. 
Thus we conclude that C = C" . ■ 

Similarly, the set V consists all the vectors, each of which contains 2"6^ non-zero entries 
with value Then, 



1 



1 



and. 



max c^d = max c^d = min(a^, 6^) — j^=^= — -^=^= 

cec.de© cecdev ay I - 6v 1 - IP 



ah 



(239) 



max c d = max c d — 

c,d c6C,aec ^(1 - a2)(l - 62) 



: minfa^, 6^^ 



1 



ah 



ah^{l-a^){l-h^) v^(l -a2)(l -62) 

1 



: min(a^, 6^) min(l — a^, 1 — 6^ 



Hence, 



a6v/(l-a2)(l-62) 



(240) 



2 , (241) 

The lower bound of A can be derived in a similar manner. We rewrite (12081) in the following 
form 



X1X2 



b +(-A) 



—a 



(242) 
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By the same arguments as above, we obtain 



I min(l— a^,fe^) min(a^,l— 6^) 

-X<X2{Puv) ^^^^ (243) 

Combining (12411) and fl243p . we have 

I min(l— a^,fe^) min(a-^,l— 6-^) . min(a^ min(l— ,1— 6^) 

- HPuv) ^^V^^ < ^ < HPuv) ^^V^^ (244) 
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